Benjamin Saller
Abstract Edge
<bcsaller@ideasuite.com>
Archetypes (formerly known as CMFTypes) is a Zope Product that simplifies the creation of new content types under Zope 2 and the Content Management Framework (CMF). Many content management projects involve introducing new types of content, which in the non-trivial case requires an informed understanding of how Zope and the CMF work. Archetypes provides a simple, extensible framework that can reduce both the development and maintenance costs of CMF content types while flattening the learning curve for the simpler cases.
The Zope/CMF world is populated with many projects that deal with multiple content types. Most of these deal with one or two aspects of content management, mainly specifying what kinds of data constitute a type and what might serve as an acceptable source of data for populating a content object. By this I mean products that define what fields constitute a type (body, teaser, etc.) and products that convert rich original content (Office documents, PDF, DocBook, etc.) into web-ready forms.
Archetypes introduces a framework that provides some layering between concerns and provides a reasonable set of default policies along with integration with Zope’s leading CMF implementation, Plone (www.plone.org).
Archetypes works by declaring data about the object using its custom schema definition format. A schema is simply a list of Field objects that describe the data and specify the various concerns that should be implemented, generated, or otherwise dealt with on behalf of the content objects. At its most basic a schema is very simple.
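As a minimal sketch, assuming stand-in Field, StringField, TextField and Schema classes defined inline so that the snippet runs without Zope (in a real product these names would come from the Archetypes package, whose classes do far more), a two-field schema might be written as:

```python
# Stand-in classes so the sketch runs without Zope. They only record
# names and options; they are assumptions, not the framework's code.
class Field:
    def __init__(self, name, **options):
        self.name = name
        self.options = options

    def getName(self):
        return self.name


class StringField(Field):
    pass


class TextField(Field):
    pass


class Schema:
    """An ordered collection of Field objects."""

    def __init__(self, fields=()):
        self._fields = list(fields)

    def fields(self):
        return list(self._fields)

    def names(self):
        return [field.getName() for field in self._fields]


# The schema itself: nothing more than a list of named fields.
schema = Schema((
    StringField('teaser'),
    TextField('body'),
))
```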
This declares a schema with two fields, teaser and body. When inserted into a class, this causes each Field to be associated with accessors and mutators and gives the class generated edit and view forms, validation scripts, a unique ID for the object, the ability to handle references between objects, and a host of other features.
While it's interesting that little else is required, you can also define a variety of policy in the schema and in the Python code referred to by the schema. Below is the same example with a bit more detail and policy.
By providing a bit more information we are able to generate forms that indicate that body is required (with supporting validation), declare that both Fields contribute to the SearchableText (the CMF's way of saying which parts of a content object are “full text searchable”), and provide descriptive labels for the generated forms. What is also interesting is that we have specified information for the body field that deals with content transformation: we have indicated that the Archetypes runtime should deal with content originating in any of the listed MIME types (which results in the automatic generation of an HTML version of the content as well as keeping the original). By requesting that the ‘RichWidget’ be used on the form we are given the chance to enter text into a textarea or upload a file in one of the known MIME types. It also exposes the field for use with External Editor, allowing content authors to edit the Field locally using a content editor of their choosing and save updates directly to the resulting content object (with all of the transformation and cataloging concerns automatically addressed).
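A hedged sketch of such a richer schema follows. The field and widget classes are inline stand-ins so the snippet runs on its own, and the option names (required, searchable, allowable_content_types, widget) and the particular MIME types are illustrative assumptions rather than a transcript of the framework's API:

```python
# Stand-in widget and field classes; they only record the declared policy.
class Widget:
    def __init__(self, label='', description=''):
        self.label = label
        self.description = description


class TextAreaWidget(Widget):
    pass


class RichWidget(Widget):
    pass


class Field:
    def __init__(self, name, required=False, searchable=False,
                 allowable_content_types=('text/plain',), widget=None):
        self.name = name
        self.required = required
        self.searchable = searchable
        self.allowable_content_types = tuple(allowable_content_types)
        self.widget = widget or Widget(label=name)


schema = (
    Field('teaser',
          searchable=True,
          widget=TextAreaWidget(label='Teaser',
                                description='A short lead-in')),
    Field('body',
          required=True,
          searchable=True,
          # Illustrative list of source formats to accept and transform.
          allowable_content_types=('text/plain', 'text/html',
                                   'application/msword'),
          widget=RichWidget(label='Body text')),
)

# Policy declared once in the schema can be queried by any later concern.
searchable_names = [f.name for f in schema if f.searchable]
required_names = [f.name for f in schema if f.required]
```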
Archetypes continues by providing a set of abstractions of things like data type mapping and data storage. It is my hope that projects can provide custom implementations of Field for new data types, new types of transformations, new storage models and that these will be largely interchangeable and useful on other projects. The project provides simpler ways to set things like icons, actions and other elements associated with content types in the CMF.
Archetypes defines active schemas when creating new types. Rather than just defining the properties and policies of a new type, the schema definition implements policy and binds it to different concerns. This has many interesting implications. Projects can define custom Field types that carry their own policy. This means that developers can build collections of policy that are reusable across a variety of new and different types, something that is not as simple when writing custom accessors and mutators for each field by hand. It is an important goal of the Archetypes project that all of the inner workings of the project be exposed at the programmatic level for adaptation to individual projects' needs without requiring changes to the core.
At the same time it's quite possible for developers to implement their own policy and ignore most of the inner workings of the runtime system. The Archetypes framework carries a rich set of configurable behaviors, but if the existing framework is not sufficient or cannot be extended then it makes every effort to stay out of the developer's way. Standard object-oriented techniques like using subclassing to specialize content types still work normally, and developers can choose to provide their own implementation of almost any behavior. To facilitate this (and to ease the common case where the developer does wish to depend on the Archetypes system) a class generation layer is employed whenever a change to the schema is detected.
The Class Generator inspects the schema, determines the names of the accessors and mutators (either registered names or names derived from a default naming convention), and then determines whether the methods need to be generated. This point is important: if methods with the expected accessor and mutator names already exist on the class then nothing further is done; if not, the field is bound to the active schema using a generated method.
In Archetypes the default is that each class retains a pointer to the schema, accessible through the Schema() method call. The schema grants runtime access to the names of the accessors and mutators and contains the field objects which hold the actual implementation.
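The generation step described above can be sketched as follows. This is a simplified illustration, not the framework's generator: the Field class is a stand-in, generate_methods is a hypothetical helper, and the get/set plus capitalized field name convention is assumed for the default names.

```python
class Field:
    """Stand-in field holding the get/set implementation."""

    def __init__(self, name, accessor=None, mutator=None):
        self.name = name
        cap = name[0].upper() + name[1:]
        # Registered names win; otherwise fall back to a naming convention.
        self.accessor = accessor or 'get' + cap
        self.mutator = mutator or 'set' + cap

    def get(self, instance):
        return instance.__dict__.get(self.name)

    def set(self, instance, value):
        instance.__dict__[self.name] = value


def make_accessor(field):
    def accessor(self):
        return field.get(self)
    return accessor


def make_mutator(field):
    def mutator(self, value):
        field.set(self, value)
    return mutator


def generate_methods(klass, fields):
    """Add accessors/mutators only where the class does not define them."""
    for field in fields:
        if not hasattr(klass, field.accessor):
            setattr(klass, field.accessor, make_accessor(field))
        if not hasattr(klass, field.mutator):
            setattr(klass, field.mutator, make_mutator(field))


class Article:
    def setTeaser(self, value):
        # Hand-written mutator: the generator must leave it alone.
        self.__dict__['teaser'] = value.strip()


generate_methods(Article, [Field('teaser'), Field('body')])

article = Article()
article.setTeaser('  hello  ')   # custom method, strips whitespace
article.setBody('text')          # generated method
```

Because the check is a plain hasattr on the class, a developer opts out of generation for a field simply by defining a method with the expected name.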
The following shows access to the attributes at runtime and getting a handle to the accessor.
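A minimal sketch of that runtime access, assuming stand-in Field and Schema classes and a hypothetical getAccessor helper on the field (the names are illustrative and chosen to mirror the description above):

```python
class Field:
    def __init__(self, name):
        self.name = name
        # Assumed default naming convention: 'body' -> 'getBody'.
        self.accessor = 'get' + name.capitalize()

    def getAccessor(self, instance):
        """Return the bound accessor method for this field."""
        return getattr(instance, self.accessor)


class Schema:
    def __init__(self, fields):
        self._fields = {field.name: field for field in fields}

    def __getitem__(self, name):
        return self._fields[name]

    def keys(self):
        return list(self._fields)


ARTICLE_SCHEMA = Schema((Field('teaser'), Field('body')))


class Article:
    def __init__(self, body):
        self._body = body

    def Schema(self):
        # Each class keeps a pointer to its schema.
        return ARTICLE_SCHEMA

    def getBody(self):
        return self._body


article = Article('Some text')
field = article.Schema()['body']          # look the field up by name
accessor = field.getAccessor(article)     # handle to the accessor
value = accessor()
```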
Field objects implement a get/set protocol, which is what the automatically generated methods delegate to. Examples of what these generated methods might look like follow:
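Written out by hand, a generated accessor and mutator pair for a field named body might look roughly like the sketch below. The Field stand-in and the module-level SCHEMA dictionary are assumptions made so that the snippet is self-contained:

```python
class Field:
    """Stand-in field implementing the get/set protocol."""

    def __init__(self, name):
        self.name = name

    def get(self, instance):
        return getattr(instance, '_' + self.name, None)

    def set(self, instance, value):
        setattr(instance, '_' + self.name, value)


SCHEMA = {'body': Field('body')}


class Article:
    def Schema(self):
        return SCHEMA

    # What a generated accessor amounts to: delegate to the field.
    def getBody(self):
        return self.Schema()['body'].get(self)

    # What a generated mutator amounts to: delegate to the field.
    def setBody(self, value):
        self.Schema()['body'].set(self, value)


article = Article()
article.setBody('Hello')
```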
A very simple Field level implementation of get is shown below.
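As a sketch only, with a hypothetical DictStorage standing in for whatever storage the schema selects, such a field-level get (and its matching set) could be as small as:

```python
class DictStorage:
    """Hypothetical storage that keeps values in the instance __dict__."""

    def get(self, name, instance):
        return instance.__dict__[name]

    def set(self, name, instance, value):
        instance.__dict__[name] = value


class Field:
    def __init__(self, name, storage=None):
        self._name = name
        self.storage = storage or DictStorage()

    def getName(self):
        return self._name

    def get(self, instance):
        # The field does not know how data is kept; it asks the storage.
        return self.storage.get(self.getName(), instance)

    def set(self, instance, value):
        self.storage.set(self.getName(), instance, value)


class Document:
    pass


doc = Document()
body = Field('body')
body.set(doc, 'Some text')
```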
Here self.getName() returns the name of the field as it was registered in the schema. The field delegates to the storage concern, which can in turn be defined or overridden in the schema. A sample of what a storage get method might look like is shown below.
This simple layering allows for fields that store their field-level data in different ways or that require different data mapping before storage can be done (for example, converting the field to a different data type such as a DateTime object). It is then possible to use as little or as much of the layering machinery as a particular project needs, or simply to provide your own implementation.
As an example, if you require a Field that normalizes its data you could define your own field implementation that maps the value before calling the storage layer, or simply write your own custom mutator. The example below might be the mutator for a field named 'foo' that converts its argument to a lowercased string.
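A hedged sketch of two interchangeable storages follows. Both class names and the (name, instance) argument order are assumptions for illustration; the second storage keeps its values in a private dictionary on the instance, named _data here, to show that the field never needs to know where the data lives.

```python
class AttributeStorage:
    """Stores each field as a plain attribute on the instance."""

    def get(self, name, instance):
        if not hasattr(instance, name):
            raise AttributeError(name)
        return getattr(instance, name)

    def set(self, name, instance, value):
        setattr(instance, name, value)


class DictBackedStorage:
    """Hypothetical alternative: one private dict holds all values."""

    def get(self, name, instance):
        try:
            return instance._data[name]
        except (AttributeError, KeyError):
            raise AttributeError(name)

    def set(self, name, instance, value):
        if not hasattr(instance, '_data'):
            instance._data = {}
        instance._data[name] = value


class Document:
    pass


doc = Document()
for storage in (AttributeStorage(), DictBackedStorage()):
    # The same calls work regardless of which storage is plugged in.
    storage.set('body', doc, 'Some text')
    assert storage.get('body', doc) == 'Some text'
```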
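One way such a mutator might look is sketched here, with a stand-in Field and a module-level SCHEMA dictionary assumed so that the snippet runs on its own:

```python
class Field:
    """Stand-in field implementing the get/set protocol."""

    def __init__(self, name):
        self.name = name

    def get(self, instance):
        return instance.__dict__.get(self.name)

    def set(self, instance, value):
        instance.__dict__[self.name] = value


SCHEMA = {'foo': Field('foo')}


class Example:
    def Schema(self):
        return SCHEMA

    def getFoo(self):
        return self.Schema()['foo'].get(self)

    def setFoo(self, value):
        # Custom mutator: normalize before handing the value to the field.
        self.Schema()['foo'].set(self, str(value).lower())


example = Example()
example.setFoo('MiXeD Case')
```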
Storage implementations exist for attributes, folderish (object managed), and special metadata storage, but could easily be extended so that certain fields (such as large Office documents) could be kept on the file-system or in a database.
Within a given site, Archetypes-based objects each register with the ArchetypeTool and are assigned a unique ID. This UID should remain consistent in the face of rename and move operations. References to other Archetypes-based objects are possible, and back-references are maintained.
Archetypes has a built-in notion of validation for the object. Two forms of validation exist. The first and simplest is to delegate to the validation service that resides in the Archetypes CVS repository. With this tool, reusable elements of validation can be referred to in the schema simply by defining a tuple of named validators. For example, to indicate that the field 'url' should validate as a URL we specify the following in the schema.
In many cases this is enough, but in cases where more complex validation is required (or is needed to supplement the validators list) custom validators can easily be written. The API exposed is very simple and is automatically invoked during form submission. If we need to validate that the field 'number' is in some range we could easily write a method with the name validate_number.
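The idea can be sketched as below. The registry, the is_url check and the validator name 'isURL' are stand-ins assumed for illustration; the real validation service ships its own validators with their own rules.

```python
from urllib.parse import urlparse

# Hypothetical registry mapping validator names to callables.
VALIDATORS = {}


def register(name, func):
    VALIDATORS[name] = func


def is_url(value):
    """Very rough URL check, used only for this sketch."""
    parts = urlparse(value)
    return bool(parts.scheme in ('http', 'https', 'ftp') and parts.netloc)


register('isURL', is_url)


class StringField:
    def __init__(self, name, validators=()):
        self.name = name
        self.validators = tuple(validators)

    def validate(self, value):
        """Return the names of the validators that reject the value."""
        return [name for name in self.validators
                if not VALIDATORS[name](value)]


# The schema only names the validators it wants applied.
url_field = StringField('url', validators=('isURL',))
```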
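A minimal sketch of such a method follows. The range of 1 to 100, the convention of returning an error string (or None when the value is acceptable) and the small validate driver are assumptions made for illustration:

```python
class Example:
    field_names = ('number',)

    def validate_number(self, value):
        """Return an error message, or None when the value is acceptable."""
        try:
            number = int(value)
        except (TypeError, ValueError):
            return 'number must be an integer'
        if not 1 <= number <= 100:
            return 'number must be between 1 and 100'
        return None

    def validate(self, form):
        """Stand-in for the framework hook that calls validate_<field>."""
        errors = {}
        for name in self.field_names:
            check = getattr(self, 'validate_' + name, None)
            if check is None:
                continue
            message = check(form.get(name))
            if message:
                errors[name] = message
        return errors


ok = Example().validate({'number': '50'})
too_big = Example().validate({'number': '500'})
missing = Example().validate({})
```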
If and when errors are returned they are automatically encoded for use in form-level error reporting.
As an example I include a complete, but simple, project that uses Archetypes. The project is available in the Archetypes CVS repository as ArchExample. I encourage you to look in CVS for versions of this example that track changes to the framework over time.
The following files are required to make a project function.
The complete code for Article follows:
Article uses the public API file and requires very little code to get going. This will create a type with a blurb and a TextArea widget, and a body field that allows for file uploads and a richer set of content types.
The __init__ file is equally simple.
This simply makes the type available to the CMF machinery in the normal way.
The external method, Install, in the Extensions directory is also quite simple and uses utilities in the Archetypes project to do all the heavy lifting.
At the end of this you are left with a new type inside your Plone site. All the forms are dynamically rendered and are seamless to use. The new type can use references and transformations, and can easily be updated and extended with schema changes. The more complex a project and the more types it has, the greater the gain. When requirements are volatile, having to update multiple forms and code files can be quite costly. Hopefully this example shows just how little (in this case just the schema in Article) needs to be changed and updated.
Archetypes addresses, with some success, the creation of new content types in Zope 2, CMF and Plone. It has been met with an enthusiastic response within the Plone development community and offers a powerful set of features without requiring developers to deal expressly with all the concerns that might be involved in handling large, complex objects. There is currently talk about rebuilding the content elements of Plone so that its next release uses Archetypes-based content types throughout.
There are some powerful abstractions such as delegating access and mutation through the Fields. Data mapping and storage layers operate in a way that is seamless to the developer except when they explicitly need to address those concerns. There are a number of lessons learned for both the Zope 2 and Zope 3 development communities.
Archetypes is currently available as a released package in a CMF/Plone project on SourceForge known as the collective (which hosts a number of wonderful CMF/Plone products).
Archetypes http://sf.net/projects/archetypes
Plone http://plone.org/
Zope http://zope.org/
Python http://python.org/