Choosing an XML Model

Posted by Mary McRae on Mar 24, 2017 10:17:16 AM


After seeing my infographic above, one of my colleagues asked me why I am so passionate about JATS (Journal Article Tag Suite NISO Z39.96) and its sister tag suite, BITS.  There are two big reasons: first, their suitability for the task, and second, the dedicated team of experts that designed them and continue to develop them.  That doesn’t mean I’m not an advocate for other tag suites.  I am, so long as they serve my passion about XML and using the right vocabulary for the job at hand.

The JATS standard has been developed for the STM (Scientific, Technical, and Medical) journal publishing community and contains the markup necessary for publishers to develop, manage, produce, and deliver research articles.  There are 3 flavors, listed from most to least restrictive:

  • Article Authoring (Orange in the infographic)
  • Journal Publishing (Blue in the infographic)
  • Journal Archiving and Interchange (Green in the infographic)

The vast majority of JATS users work with JATS Blue.  BITS, on the other hand, extends the JATS model for book-like structures and includes tags for front matter, back matter, parts, and chapters.  A third tag suite based on JATS is in use at ISO and is being standardized for use by international standards bodies and SDOs (Standards Development Organizations), which submit their standards to the international bodies for approval.
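For readers who have not seen JATS markup, here is a minimal sketch of what an article skeleton looks like and how it might be read with Python's standard library. The sample article is hand-written for illustration and heavily simplified: it omits the journal metadata and other elements that the real tag sets expect, so it should not be assumed to validate against any of the three JATS DTDs.

```python
import xml.etree.ElementTree as ET

# Illustrative, hand-written skeleton; simplified and not validated
# against any JATS DTD (required journal metadata is left out).
JATS_ARTICLE = """\
<article article-type="research-article">
  <front>
    <article-meta>
      <title-group>
        <article-title>An Example Article</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>Body text goes here.</p>
    </sec>
  </body>
  <back/>
</article>
"""


def summarize(xml_text):
    """Return the article title and the list of section titles."""
    root = ET.fromstring(xml_text)
    article_title = root.findtext("front/article-meta/title-group/article-title")
    section_titles = [sec.findtext("title") for sec in root.iter("sec")]
    return article_title, section_titles


title, sections = summarize(JATS_ARTICLE)
print(title)
print(sections)
```

The front, body, and back split is what lets the same article carry both its metadata and its narrative content in one file.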

Which one should I use?

If you’re a journal publisher, any or all of these forms of JATS might be the right tag set for your journals.  It really depends on your particular situation.

Take authoring, for example.  For most journal publishers, content is written by researchers who don’t author in XML, so there’s little need for the JATS Authoring DTD.  These publishers might transform the original content (often supplied in Microsoft Word) to the JATS Journal DTD (Blue) after the content creation process.  Other publishers have found they need to restrict the way their authors create content and require them to use the JATS Authoring DTD (Orange).  Publishers converting an older, print-only back catalog may choose to use the JATS Archiving and Interchange DTD (Green), since its loose structure supports varying styles used over the years and does not require rework.

It’s not uncommon for a publisher to use multiple flavors of JATS and BITS and then to customize them.  What makes the most sense for your team depends on your content, your workflows, and the platforms that host your content; you may have to consider alternatives to JATS.

How do I choose the one DTD that works for everyone in my organization?

Sometimes you don’t choose only one!

It may make business sense to force everyone to use the same model, but it’s certainly not required.  Many of our clients use multiple DTDs effectively for different communities of users within their organizations–the key is how you build your environment.

Start by finding yourself a good consultant


The best way to find out what you need is to hire a business analyst/content architect.  After he learns about your business objectives and does a thorough analysis of your content, workflows, and processes, he can recommend the best vocabularies and customizations to be deployed at each stage of the process.  He can help you develop transforms to make your tag sets interoperable and to make it easy to go from your authoring DTD to whatever other publishing and delivery models your organization requires.

Contact me and I can help you find a consultant.

Invest in technologies to facilitate your choices

If you publish scientific articles, journals, or related content, there are a number of great tools on the market that support JATS and JATS-based DTDs out-of-the-box.  Others help you to build the transforms needed to maximize your efficiency and create seamless delivery between authoring, publishing, and archiving.  And of course, I can’t finish without pointing out the usefulness of a native-XML publishing automation system like RSuite to bring everything together.


Topics: DITA, XML, XML Schema, DTD, JATS, S1000D, BITS

RSuite User Group Online Meeting

Posted by David M. Turner on Jul 21, 2016 10:01:12 AM

We're having our bi-monthly RSuite Online User Group meeting on Tuesday, July 26. This meeting is for all key end users, IT developers, and executives who have implemented RSuite.

In this edition we'll be:

  • Discussing our upcoming RSuite User Conference.
  • Showing creative new functionality regarding XML Editing in RSuite
  • Answering questions and chatting about pertinent topics in our regular "Users Helping Users" segment.

You can request a seat by visiting


Topics: RSuite, RSuite CMS, XML, CK Editor, oXygen XML Editor, FontoXML, XML Editors, #RSuiteUC16

RSI's 5-minute-series: Learn DITA in 5 minutes

Posted by Christopher Hill on Oct 24, 2013 8:59:00 AM

Lately I’ve found myself discussing DITA more often, so it is time for another in the 5-minute-series. If you are new to XML it might be helpful to start with the previous two posts on XML and Schemas before continuing.

In the previous posts I discussed how XML isn’t a specific language, but is instead a set of rules governing the syntax of languages that may be invented. The invention of XML came out of a need to be able to describe content. Word processors and desktop publishers mostly focused on the formatting of content. When you create new content in these tools you do so as a part of the layout and formatting process. With XML, you instead try to describe what the content you are entering is, for example a paragraph, a chapter, a book, an article, a caption or whatever. 

XML provides a common syntax for creating languages to describe your content, but does not specify the actual grammar. As described in detail in the previous post in this series, XML Schemas or DTDs are used to specify the exact labels and grammar of a particular type of XML.
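The difference between syntax and grammar can be sketched in a few lines of Python. The standard-library parser only checks XML syntax (well-formedness); it does not validate against a DTD or schema. The `ALLOWED_CHILDREN` table below is a toy stand-in for a schema, invented for this example, and the recipe vocabulary is hypothetical.

```python
import xml.etree.ElementTree as ET

# Toy stand-in for a schema: which child labels each element may contain.
# This table and the "recipe" vocabulary are invented for illustration.
ALLOWED_CHILDREN = {"recipe": {"title", "step"}, "title": set(), "step": set()}


def is_well_formed(xml_text):
    """True if the text obeys XML syntax rules; says nothing about grammar."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def follows_grammar(xml_text):
    """True if every element and its children are permitted by the toy table."""
    root = ET.fromstring(xml_text)
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        if allowed is None:
            return False
        if any(child.tag not in allowed for child in parent):
            return False
    return True


good = "<recipe><title>Pancakes</title><step>Mix.</step></recipe>"
broken = "<recipe><title>Pancakes</recipe></title>"      # mismatched tags
off_grammar = "<recipe><chapter>Oops</chapter></recipe>"  # valid syntax, wrong labels

print(is_well_formed(good), follows_grammar(good))
print(is_well_formed(broken))
print(is_well_formed(off_grammar), follows_grammar(off_grammar))
```

A real DTD or XML Schema expresses far more than this table (order, repetition, attributes), but the division of labor is the same: XML rules out the second document, and only the grammar rules out the third.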

While you can invent your own labels and grammar based upon XML, doing so means that unless others adopt your format, you will have to customize editing tools to understand your particular vocabulary.

Instead of always creating a vocabulary from scratch, many users of XML instead adopt a shared standard. Standards exist to represent most any data you can think of, whether it be recipes, musical scores, articles, chapters, books or anything else. These standards can be shared, and tools can be created to create, edit, manage and format based on the standard. If a community exists around my particular flavor of XML, we can share tools and techniques that can mean reduced effort required to deploy content solutions.

DITA, an acronym for Darwin Information Typing Architecture, is an XML language that is extensible and can be adapted to a range of uses. DITA is based on the concept of topics. A topic is a unit of information that typically can be read in isolation or inserted into a larger document. In order to string together topics, DITA uses the concept of a map file. A map file is simply an XML file that acts as a sort of table of contents stringing together a series of topic files.
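As a rough illustration of how a map strings topics together, the sketch below walks a small map and prints an indented table of contents. The file names and content are hypothetical and held in memory rather than on disk, and the usual DOCTYPE declarations are omitted to keep the sample short.

```python
import xml.etree.ElementTree as ET

# Hypothetical file names and content, kept in a dict instead of on disk.
FILES = {
    "guide.ditamap": """\
<map>
  <title>Example Guide</title>
  <topicref href="intro.dita">
    <topicref href="install.dita"/>
  </topicref>
  <topicref href="reference.dita"/>
</map>""",
    "intro.dita": '<topic id="intro"><title>Introduction</title>'
                  "<body><p>Start here.</p></body></topic>",
    "install.dita": '<topic id="install"><title>Installation</title>'
                    "<body><p>Run the installer.</p></body></topic>",
    "reference.dita": '<topic id="ref"><title>Reference</title>'
                      "<body><p>Details.</p></body></topic>",
}


def table_of_contents(map_name):
    """Walk the map's topicref elements and return (depth, topic title) pairs."""
    entries = []

    def walk(element, depth):
        for ref in element.findall("topicref"):
            topic = ET.fromstring(FILES[ref.get("href")])
            entries.append((depth, topic.findtext("title")))
            walk(ref, depth + 1)  # nested topicrefs become sub-entries

    walk(ET.fromstring(FILES[map_name]), 0)
    return entries


toc = table_of_contents("guide.ditamap")
for depth, title in toc:
    print("  " * depth + title)
```

Because each topic lives in its own file, the same topic can be referenced from a second map without copying it, which is where the reuse benefit comes from.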

The term “topic” is generic. DITA allows, however, the generic topic to be adapted to represent more specific structures. The basic DITA specification includes Concept, Task and Reference. These content units are more specific versions of the generic topic. They can be handled with special rules if you want. But if you don’t have special rules, they can also be treated more generically as topics.

Benefits of a common vocabulary 

Having a common vocabulary means that users of the vocabulary can share information with each other and share tools and code used to handle the content. For example, if you use a DITA-based format, there are a number of editing tools that can be used to edit your content. Tools used to process the content can also be shared. For example, DITA includes the code and stylesheets needed to create PDF, HTML and other output formats, and the community is constantly evolving. New formats may appear and other DITA-based solutions can take advantage of the tools to support the new format without needing to modify their existing processes.

For DITA, the community provides the DITA Open Toolkit. This toolkit includes a variety of transformations that can take DITA content and render it in HTML, PDF, and other formats. It also provides an extensible architecture. If you have a customized version of DITA, you can create a plug-in that can enable DITA solutions to handle the specific requirements of your customizations. Toolkit plugins can be used to configure editing tools, extend the rules of DITA, or modify the included stylesheets used to render content so that they can account for a more specific vocabulary adapted from the base DITA stylesheets. Any DITA tool can process content even if it is based on proprietary extensions because all of those proprietary extensions are mapped to more generic DITA structures. So if I use a DITA-based vocabulary that defines a “chapter,” systems that do not understand “chapter” can always treat the encoded content as a more generic “topic.” 
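The fallback from “chapter” to “topic” works through the DITA class attribute, which lists an element's ancestry from most generic to most specific. Here is a minimal sketch of how a tool might use it. The “chapter” specialization and the `KNOWN_TYPES` set are hypothetical, and the class attributes are written out in the sample only because the standard-library parser does not read the defaults a DTD would normally supply.

```python
import xml.etree.ElementTree as ET

# "chapter" is a hypothetical specialization of the generic DITA topic.
# Class values are normally defaulted by the DTD; they are explicit here
# because ElementTree does not apply DTD attribute defaults.
DOC = (
    '<chapter id="ch1" class="- topic/topic chapter/chapter ">'
    '<title class="- topic/title ">Getting Started</title>'
    "</chapter>"
)

# What this imaginary tool knows how to process.
KNOWN_TYPES = {"topic/topic", "topic/title"}


def ancestry(element):
    """Return the element's types from most specific to most generic."""
    tokens = element.get("class", "").split()
    # The first token is the "-" or "+" marker, not a type.
    return list(reversed(tokens[1:]))


def resolve_type(element, known_types):
    """Pick the most specific type the tool understands, or None."""
    for candidate in ancestry(element):
        if candidate in known_types:
            return candidate
    return None


root = ET.fromstring(DOC)
generic_match = resolve_type(root, KNOWN_TYPES)
specific_match = resolve_type(root, KNOWN_TYPES | {"chapter/chapter"})
print(generic_match)
print(specific_match)
```

A tool that has never heard of “chapter” resolves the element to the generic topic, while a tool extended with chapter-specific rules picks the more specific type from the same document.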

So while XML is a set of rules for creating a particular language to encode your content, DITA is a particular language that was designed to be able to be extended to more specific uses that still share a common grammar. DITA provides a base set of stylesheets for rendering your content in a variety of specific formats. Many XML tools exist to process any DITA-based document, and most provide extension points so that you can adapt the tool to a more specific DITA-based language without having to start from scratch. 

Tools to edit DITA documents can edit any vocabulary derived from DITA without modification, and can be extended to support more specific vocabulary structures if desired. At RSI Content Solutions we have content management systems with support for DITA that provide a range of features that make it much quicker to deploy a DITA solution without starting from scratch. Our solutions allow editing, transformation, as well as the ability to reuse content in different contexts if needed. So while XML is a set of rules governing the structure of an infinite variety of languages, DITA is a topic-based XML language used for representing content. Although you can use DITA without any modifications, many organizations wish to encode content in a less generic manner. DITA has the advantage of allowing more specific content structures to be derived from the existing generic structures if needed. This means that if you need to create an XML vocabulary you aren’t starting from scratch and you are providing a fallback mechanism for systems not aware of the specifics of your particular vocabulary.

Topics: DITA, XML, 5-minute-series

Core Critical Publishing Technology: XML-First, XML-Early, XML-Hidden

Posted by Marianne Calihanna on May 8, 2013 11:17:00 AM

XML is the foundation that enables multichannel publishing for publishers and media companies, but authors, editors, and reviewers have struggled to work with it effectively. At last, the tools and technology have caught up. Today organizations can choose whether they want XML-first, XML-early, or XML-late workflows, and even whether they want the XML hidden from their users altogether. Christopher Hill, vice president of product management, recently spoke at the MarkLogic World conference and demonstrated RSuite CMS, the industry’s first enterprise content management system powered by MarkLogic. In this presentation he details the benefits of XML-first, XML-early, and XML-hidden workflows in various publishing scenarios from leading publishing organizations using RSuite CMS.

If you'd like to learn more about RSuite CMS and how it can support your publishing organization, contact us to arrange a custom demo.

Topics: RSuite CMS, XML, Publishing Workflows

Learn to Upgrade Valuable Media and Business Content Without Draining Your Budget

Posted by Marianne Calihanna on Apr 19, 2012 3:39:00 PM

Webinar Series on the Business Case for Digital Content

RSI Content Solutions and Data Conversion Laboratory are kicking off a 6-part webinar series next week that will address the many myths associated with the world of XML, CMS, and eBooks.

The six-part webinar series, called ‘Reality Check’, features experts in content management and publishing who detail how to manage information and transform content to work within eBooks, browsers, and mobile platforms.

Following are the webinars in this series. You can read more here.

  • April 26 | Truth of Digital Revenue Streams
    Panelist: Darrell W. Gunter, CEO, Gunter Media Group
    Having worked with hundreds of publishing professionals during the past 10 years, we've observed that organizations that implement a strategic content management initiative and convert backlist titles into XML are the ones seeing digital revenue exceed print. Join this free webinar and hear the truths about what your organization can do to recognize true digital revenue.
  • May 9 | Truth About Automation
    For publishers and media companies, automating editorial and production tasks is necessary to keep pace with customer consumption as well as the competition. While many knowledge workers view automation as a threat to job security and an impediment to editorial quality, this webinar illustrates the truths around automating common editorial and production tasks. Indeed automation can free staff to focus on better content development.
  • June 9 | Truth About ROI
    Panelist: Christopher Hill, VP Product Development, RSI Content Solutions
    Publishers understand that content management and data conversion are pivotal pieces in today's publishing environment. Yet budgeting for these initiatives can quickly scale to the point where executives question why they should stray from the status quo. In this free webinar, DCL and RSI Content Solutions will lead a panel of publishing professionals who will discuss how they made their business case and received enthusiastic executive buy-in for content management and data conversion in their organizations.
  • August 29 | Truth About DIY CMS and Conversion
    Panelist: Pat Sabosik, Elm City Consulting
    While using internal resources to develop a homegrown content management tool or convert your backlist to XML sounds like a cost-effective approach, the reality is that 82% of IT projects fail. This webinar focuses on the real concerns you need to address so that your organization can make educated decisions based on truths and not on what simply seems like it will work.
  • September 19 | Truth About Quality
    Panelists: Mike Edson and John Corkery, The DETI Group
    The premise of all publishing organizations is to provide quality content in a format that customers desire. Ask any copy editor about house style and you can anticipate a lengthy and thoughtful response. Authors too expect nothing but perfection when transforming intellectual property into a print or digital product. So how do successful publishing organizations blend automation into workflows without sacrificing quality?
  • November 19 | The Truth From the Publishers' Perspective
    Panelists: Barry Bealer, CEO, RSI Content Solutions and Mark Gross, CEO, Data Conversion Laboratory
    Throughout the year, DCL and RSI Content Solutions have polled a large number of publishing and media executives to understand where they are in terms of strategic XML content management. We’ve asked tough questions around true revenue numbers, quality-control issues, content automation, and ROI. In this webinar, join CEOs Barry Bealer of RSI Content Solutions and Mark Gross of Data Conversion Laboratory, who share not only the results of our 10-month polling but also their views on what the metrics mean.

Topics: content management for publishers, Webinar, CMS for publishers, CMS, XML

Automated Content Transformation: XML to InDesign and Beyond

Posted by Marianne Calihanna on Apr 17, 2012 4:42:00 PM

No matter the allure and depth of digital products in today's world, publishers still need to print well-designed pages. At the last RSuite User Conference, we asked attendees what feature they found most useful. The ease of content transformations ranked high on the list of responses. Check out the following description about how RSuite CMS can automate content transformation to InDesign, HTML, EPUB, and PDF.

Want to learn how publishers are using RSuite to automate XML transformation to InDesign? Download our latest white paper: DITA For Publishers: How Successful Publishers Deliver Content


Topics: content management for publishers, content transformations, XML

Dan Dube from RSuite Cloud will Speak at Online Information 2011

Posted by Sarah Silveri on Nov 28, 2011 12:29:00 PM

On Thursday, December 1st at 1:10pm in Theatre 4, near the XML Pavilion, Dan Dube, EVP, Cloud Solutions at Really Strategies, Inc. will speak at Online Information 2011, the largest event dedicated to the Information Industry.

Mr. Dube will present, “Success Stories: automated production of print and ebooks, featuring RSuite Cloud.” The presentation highlights RSuite Cloud, an award-winning, end-to-end publishing solution for book publishers.

“For book publishers, delivering content to devices like the iPad and Kindle is not an option, it's imperative,” stated Mr. Dube.  “With so many technology and outsourcing choices available to publishers, navigating the options often results in uncertainty and confusion.  In my presentation, I’ll share real-world case studies of book publishers who are successfully using XML-based technology to automatically produce multi-channel output to print and ebooks.”

Topics: ebooks, RSuite Cloud, XML

Book Expo America 2011: RSuite Cloud will be there. Will you?

Posted by Sarah Silveri on May 2, 2011 8:00:00 AM

Book Expo America is coming up in just a few weeks and we’re so excited to be part of it. RSuite Cloud will be representing Really Strategies, Inc.

At Book Expo America, book publishers can see RSuite Cloud and understand how it is helping some of the leading publishers:

  • Increase revenues with faster time-to-market
  • Produce pages in minutes not weeks
  • Simultaneously output to print, web, iPad, and more
  • Automatically convert Word files to XML
  • Translate and publish to more than 200 languages

People from all around the world within the publishing industry will be there, and we want to see you between May 24th and May 26th. We’ll be in the IDPF Digital Zone at Kiosk #2309. Schedule your time with us today!

Tweet about us at #BookExpo using the #RSuiteCloud hashtag.

Topics: content management for publishers, content management, CMS for publishers, publishing, CMS, XML

The American Institute of Physics Selects RSuite Content Management System to Modernize its Publishing Program

Posted by Marianne Calihanna on Apr 26, 2011 10:10:00 AM

Really Strategies is pleased to add The American Institute of Physics to its growing list of STM and journal publishers who are using RSuite to manage content.

"The American Institute of Physics selected RSuite to manage and store its vast amount of content, including metadata, full-text XML, PDFs, images, and multimedia assets,” stated Evan Owens, AIP’s Chief Information Officer, Publishing. “We see RSuite as a key technology that will allow us to implement industrial strength content management, including version control and automated content distribution, to best support our digital publishing program."

Click here to read the official press release.

Topics: content management, publishing, XML, STM publishers, journal publishers

Make XML a solution (instead of a challenge)

Posted by Marianne Calihanna on Mar 30, 2011 9:17:00 AM

Address the top 3 digital publishing challenges with XML. Most content producers no longer rely on a single channel for content distribution. To efficiently and competitively expand digital offerings, publishers are looking to re-use existing content to create new products. Whether creating new content “mash-ups” out of existing content repositories or ensuring consistency between publications that share information, content re-use is a powerful capability allowing content to achieve its maximum value. XML provides an anchor for coping with a diverse and expanding number of channels.

Download our latest white paper, "Make XML a Solution Instead of a Challenge."


Topics: content management, CMS for publishers, publishing, CMS, XML
