"DITAworks is an Eclipse-based solution which is built on DITA architecture and supports collaborative modeling, maintenance, and publishing of complex documentation arrays."

*instinctools GmbH

DITA Models single-sourcing: Why not?

It is well known that DITA is a great information architecture that enables single-source publishing of content into different formats. “Write once and use everywhere” is its motto. But when it comes to the definition of DITA itself and of DITA specializations, we have to decide which format to maintain our models in: DTD or XML Schema. Once this decision is made and the model is developed, changing the chosen format is quite time-consuming, and in most cases people stay with the originally selected format because of the effort and time involved.

DITA itself has to be supported in both formats in parallel.

But why not consider single-sourcing of DITA models? I think this would be a logical step that follows the single-sourcing philosophy of DITA.

DTD and XML Schema are not the best notations for describing the inheritance structures of DITA. They were chosen as notations for DITA simply because they are the most popular formats in the XML world. Neither of them addresses all the needs of DITA modeling, especially validation of the resulting models.

Even a specialization that is valid from the DTD point of view can be invalid from the DITA point of view. Such an invalid model will work perfectly with most of your XML tools. You will first notice that something is wrong when the DITA Open Toolkit produces garbage output, or when generalization to standard DITA types turns out to be impossible.
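To make the gap concrete, here is a minimal Python sketch of one check that a DTD cannot express: a specialized element may only contain children that are, or specialize, something its base element allows. The `ElementType` class, the `my-d` module, and the element names `good` and `bad` are hypothetical, and the content models are heavily simplified (order and cardinality are ignored), so this is an illustration of the idea rather than a full DITA validator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementType:
    name: str                          # element name as declared in the DTD
    class_attr: str                    # default value of the DITA @class attribute
    children: frozenset = frozenset()  # allowed child names (order/cardinality ignored)


def ancestry(class_attr):
    """Split a @class value into (module, element) pairs, most general first."""
    tokens = class_attr.split()
    if len(tokens) < 2 or tokens[0] not in ("-", "+"):
        raise ValueError(f"malformed class attribute: {class_attr!r}")
    pairs = []
    for token in tokens[1:]:
        module, sep, element = token.partition("/")
        if not (module and sep and element):
            raise ValueError(f"malformed class token: {token!r}")
        pairs.append((module, element))
    return pairs


def check_specialization(elem, registry):
    """Return a list of DITA-level problems that a DTD parser would not report."""
    problems = []
    chain = ancestry(elem.class_attr)
    if chain[-1][1] != elem.name:
        problems.append(f"{elem.name}: @class must end with the element's own name")
    if len(chain) < 2:
        return problems  # a base type has nothing to inherit from
    base = registry.get(chain[-2][1])
    if base is None:
        problems.append(f"{elem.name}: unknown base element {chain[-2][1]!r}")
        return problems
    for child_name in sorted(elem.children):
        child = registry.get(child_name)
        if child is None:
            problems.append(f"{elem.name}: unknown child {child_name!r}")
            continue
        # A child is acceptable if it, or one of its ancestors, is allowed in the base.
        lineage = {element for _, element in ancestry(child.class_attr)}
        if not lineage & base.children:
            problems.append(
                f"{elem.name}: child {child_name!r} is not allowed in base {base.name!r}"
            )
    return problems


# Simplified, partly hypothetical model: "good" and "bad" both specialize <ph>.
registry = {e.name: e for e in [
    ElementType("p", "- topic/p ", frozenset({"ph"})),
    ElementType("ph", "- topic/ph ", frozenset({"ph"})),
    ElementType("good", "+ topic/ph my-d/good ", frozenset({"good"})),
    ElementType("bad", "+ topic/ph my-d/bad ", frozenset({"ph", "p"})),
]}

for name in ("good", "bad"):
    print(name, check_specialization(registry[name], registry))
```

Both `good` and `bad` could be declared in a DTD without complaint. Only the inheritance-aware check reports that `bad` admits a `p` child, which could not be generalized back to a plain `ph`.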

That is why we can think about a third notation that fits the modeling needs around DITA better. With DITA and the specialization models defined in it, we can simply generate DTDs or XSDs on demand. We can extend this list to RELAX NG if needed.
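The generation idea can be sketched in a few lines of Python. The `ElementDef` class and both generator functions below are assumptions made for illustration. They are not the DITAWorks model or its exporters, and the content model is reduced to "any allowed child, any number of times".

```python
from dataclasses import dataclass
from xml.sax.saxutils import quoteattr


@dataclass(frozen=True)
class ElementDef:
    """Hypothetical format-neutral description of one element."""
    name: str
    class_attr: str        # default value of the DITA @class attribute
    children: tuple = ()   # allowed child element names
    mixed: bool = False    # True if text content is allowed


def to_dtd(e):
    """Render the element as DTD declarations."""
    parts = (["#PCDATA"] if e.mixed else []) + list(e.children)
    model = "(" + " | ".join(parts) + ")*" if parts else "EMPTY"
    return (f"<!ELEMENT {e.name} {model}>\n"
            f'<!ATTLIST {e.name} class CDATA "{e.class_attr}">')


def to_xsd(e):
    """Render the same element as an XML Schema declaration."""
    choice = ""
    if e.children:
        refs = "".join(f'        <xs:element ref="{c}"/>\n' for c in e.children)
        choice = ('      <xs:choice minOccurs="0" maxOccurs="unbounded">\n'
                  f"{refs}"
                  "      </xs:choice>\n")
    mixed = ' mixed="true"' if e.mixed else ""
    return (f'  <xs:element name="{e.name}">\n'
            f"    <xs:complexType{mixed}>\n"
            f"{choice}"
            f'      <xs:attribute name="class" type="xs:string" '
            f"default={quoteattr(e.class_attr)}/>\n"
            "    </xs:complexType>\n"
            "  </xs:element>\n")


def to_xsd_schema(elements):
    """Wrap all element declarations in one schema document."""
    body = "".join(to_xsd(e) for e in elements)
    return ('<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">\n'
            f"{body}</xs:schema>\n")


# One source model, two output formats (content models simplified).
keyword = ElementDef("keyword", "- topic/keyword ", (), mixed=True)
ph = ElementDef("ph", "- topic/ph ", ("ph", "keyword"), mixed=True)
MODEL = [ph, keyword]

print("\n".join(to_dtd(e) for e in MODEL))
print(to_xsd_schema(MODEL))
```

Adding a further target such as RELAX NG would then mean adding one more generator function, while the model itself stays untouched.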

Solution

Guided by this idea, the DITAWorks development team tried to implement it. The new neutral notation we used was an EMF model that defines the structural constraints of XML and some specifics of DITA (containers, inheritance, domains and so on). Then we added export to DTD and XML Schema. As the next logical step, import from DTD and XML Schema followed.

In this way, we created a basis for single-sourcing of DITA models. Models can now be maintained in this neutral format and exported to the needed format on demand. The next challenge to be addressed was editorial tooling for this source EMF model.

This challenge was addressed by introducing a set of visual editors for the main DITA containers:

  • Infotypes (shell DTDs)
  • Modules
  • Domains
  • Packages

These editors provided editing and validation functionality for the main components of DITA and made the specialization task much easier and less error-prone.

But specialization itself still required an extensive understanding of the DITA structure. The most complicated part is that the information architect always has to deal with single element definitions that redefine inherited structures, and never has an easy way to see what kind of document structure results from the specialization.

To address this problem, the DITAWorks team developed a special visual editor that shows the infotype resulting from specialization in tree form and displays the changes made to it relative to the previous level of inheritance. Additionally, the main specialization actions became available in wizard-like dialogs directly from this view.

DITA Infotype visual editor
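The idea of showing a specialized structure relative to the previous level of inheritance can be approximated in plain Python. The function below, the `faq` infotype, and its `faqbody` element are hypothetical, and the marker symbols are arbitrary. The sketch compares only one level of children and is not meant to reflect how the DITAWorks editor is implemented.

```python
def structure_diff(base_children, new_children, specializes):
    """Compare a specialized content model with the one it inherits from.

    specializes maps a specialized element name to the base element it replaces.
    Returns display lines: "  " inherited, "~" specialized, "!" unexpected, "-" removed.
    """
    lines = []
    covered = set()  # base children that are still present in some form
    for child in new_children:
        if child in base_children:
            lines.append(f"  {child}")
            covered.add(child)
        elif specializes.get(child) in base_children:
            lines.append(f"~ {child} (specializes {specializes[child]})")
            covered.add(specializes[child])
        else:
            lines.append(f"! {child} (no counterpart in base)")
    for child in base_children:
        if child not in covered:
            lines.append(f"- {child} (removed)")
    return lines


# Simplified base infotype and a hypothetical "faq" specialization of it.
topic_children = ["title", "shortdesc", "body"]
faq_children = ["title", "faqbody"]
faq_specializes = {"faqbody": "body"}

print("faq")
for line in structure_diff(topic_children, faq_children, faq_specializes):
    print("  " + line)
```

A real editor would apply the same comparison recursively to every child, which is what produces a full tree of the resulting infotype.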

In one of the upcoming posts I would like to explain the functionality of the visual DITA editors in more detail.

Conclusion

By introducing a new XML structure notation and providing visual tooling for it, the DITAWorks modeling components can enable single-sourcing of DITA models and dramatically simplify the process of model definition and maintenance.

DITA Model single-sourcing





2 Responses to “DITA Models single-sourcing: Why not?”

  1. ghkrause says:

    Hi Alex,
    Maybe I should google for it, but there are so many three-letter abbreviations using the very same letters. What is an EMF model?
    Any link to background information welcome.

    • leha says:

      EMF is an abbreviation for Eclipse Modeling Framework (http://www.eclipse.org/modeling/emf/).
      EMF is used in DITAWorks as the master data structure for DITA model information.


Copyright © 2008-2010 * instinctools GmbH