Implementing Automatic Content Validation

Content validation is a mechanism that helps editors ensure that their work results are consistent and complete. It can complement writing style guides, publishing workflows, agreements on content and its structure, etc. More and more, it is becoming a requirement in enterprises taking advantage of content management and publishing systems.

With Scrivito, validations can be set up for all CMS object and widget types to indicate that page content needs to meet specific criteria. Three severity levels are available for notifying editors: errors (red), warnings (orange), and informational messages (blue, sometimes also called hints).

Errors prevent the working copy concerned from being published, while warnings and informational messages don’t.

  • Validations are only applied to pages that have been changed in the working copy. This prevents pages from becoming invalid even though they are identical to the corresponding pages in the published content.

Validation examples

When developing validations, keep the rules simple and easy to understand. Make it clear in the notifications whether an issue is an error that “must” be fixed, or just a warning or hint that “should” or “can” be observed. Here are a couple of examples:

  • The meta description of a page must neither be empty nor the same as the title. The meta description (also known as page description) is displayed on search engines’ results pages. It is crucial to SEO that this description is present and reflects what the respective page is about.
  • Every article must include at least one image. Visitors appreciate images if they make reading the article more fun or support understanding. This requirement could be limited to specific page types, e.g. blog posts.
  • Text must have a specific format. Sometimes, content (especially structured content like contacts or addresses) includes data expected to follow specific formatting rules, e.g. telephone numbers or email addresses. It is helpful to reject such data if it violates those rules.
  • A time span must be valid, i.e. the end date not before the start date. When working with pages that are meant to be valid only for a specific period of time, we want their start and end dates to make sense, so checking them is a good idea.
  • Columns should not be left empty. One could reason that columns should not be used (as a workaround for missing CSS) to generate horizontal space. You could have a warning displayed if any column of a column widget is empty.

Let’s start and implement these requirements as validations. We’ll use the Scrivito Example App as our basis.

Provide a meta description

Validations can be defined in the editing configuration of a CMS object or widget class. They are provided as callbacks, either for the class as a whole (class-based) or for its individual attributes (attribute-based).

In this example of an attribute-based validation, we ensure that the meta description is not forgotten and hasn’t been copied over from the page title.

In the Example App, editors can provide the meta description of a page via the “Metadata” tab in the page properties. To check if it’s empty or just a copy of the page title, add the following validation to the editing configuration of the Page class:
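A minimal sketch of such a validation is given here. It assumes that the callback receives the attribute value and a context object exposing the page as obj, and that the page title is stored in a title attribute. The names validateMetaDescription and pageValidations are hypothetical, and the call to Scrivito.provideEditingConfig is only indicated in a comment so that the snippet runs on its own.

```javascript
// Attribute-based validation callback (assumed signature: value, { obj }).
function validateMetaDescription(metaDataDescription, { obj }) {
  if (!metaDataDescription || metaDataDescription.trim().length === 0) {
    return "The meta description must be set. It is displayed on search engines' results pages.";
  }

  // Assumption: the page title is kept in an attribute named "title".
  if (metaDataDescription === obj.get("title")) {
    return "The meta description must not be the same as the page title.";
  }

  // Returning nothing means the value is considered valid.
}

// An attribute-based entry is a pair: [attribute name, callback].
const pageValidations = [["metaDataDescription", validateMetaDescription]];

// Assumed wiring in the editing configuration of the Page class:
// Scrivito.provideEditingConfig("Page", { validations: pageValidations });
```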

The attribute that keeps the meta description, metaDataDescription, is taken (together with other metadata attributes) from a constant, metadataAttributes, defined in “_metadataDescription.js” in the parent directory.

In the validations array, each attribute-based validation is represented as an array made up of the attribute name and the callback function.

At least one image in articles

In contrast to the “Provide a meta description” validation above, the validation we want here is class-based because the requirement is met if any of the article’s widgetlist attributes (and not a specific one) contains at least one image. As we did above, let’s add a callback for this to the editing configuration of the Example App’s Page class:
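The following is a sketch of what such a class-based callback could look like. It relies on obj.widgets() returning all widgets of the page and on widget.objClass() returning the widget's class name. The function name validateHasImage and the message text are made up for illustration, and the editing configuration call is shown only as a comment.

```javascript
// Class-based validation callback: it receives the CMS object itself.
function validateHasImage(obj) {
  // Collect the image widgets from all widgetlist attributes of the page.
  const imageWidgets = obj
    .widgets()
    .filter((widget) => widget.objClass() === "ImageWidget");

  if (imageWidgets.length === 0) {
    return "The page must include at least one image.";
  }
}

// A class-based entry is just the callback itself.
const pageValidations = [validateHasImage];

// Assumed wiring in the editing configuration of the Page class:
// Scrivito.provideEditingConfig("Page", { validations: pageValidations });
```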

With class-based validations, there’s just the callback function provided directly as an element of the validations array.

Now, in the Example App, in editing mode, open a page, e.g. “Jobs” from the “About” menu, and take a look at the notification icon on the sidebar: If it doesn’t indicate an error on the page, you’ve either left the page unchanged in the working copy (see the note on the screenshot), or added an image manually.

After clicking the icon to open the “Notifications” panel, the error message we specified in our above validation definition should be visible.

You will have noticed that the error is displayed even though the page includes several images. The header image, as well as the images on the social cards properties tab, don’t count because they aren’t widgets, but what about the carousel further down on the page? We could demand that the page includes at least one ImageWidget or CarouselWidget by changing the filter function accordingly:
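One way to sketch this variation is to keep the accepted class names in a list and test each widget against it. The constant IMAGE_WIDGET_CLASSES and the function name validateHasImageOrCarousel are hypothetical. Note that the sketch only checks for the presence of a carousel, not whether it actually contains images.

```javascript
// Widget classes that count as "an image" for this validation.
const IMAGE_WIDGET_CLASSES = ["ImageWidget", "CarouselWidget"];

// Class-based validation callback: it receives the CMS object itself.
function validateHasImageOrCarousel(obj) {
  const imageWidgets = obj
    .widgets()
    .filter((widget) => IMAGE_WIDGET_CLASSES.includes(widget.objClass()));

  if (imageWidgets.length === 0) {
    return "The page must include at least one image or carousel.";
  }
}

// Assumed wiring in the editing configuration of the Page class:
// Scrivito.provideEditingConfig("Page", { validations: [validateHasImageOrCarousel] });
```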

Finally, if you would like to apply this validation to other page classes as well, simply place the callback into a separate file, import the function into the corresponding editing configuration, and call it in the validation definition.

Text must have a specific format

You’ve probably come across UK postal codes already. If not, they’re a mixture of five to seven digits and capital letters, optionally with a space between the outward and the inward code, like SW1A 1AA for Buckingham Palace. To prevent invalid codes from being published, you would need a rather lengthy regular expression, but, as we only want to illustrate the working principle, we’ll content ourselves with something simpler: the format of prices.

The Example App includes a PricingWidget with attributes for the prices of three plans as well as one for the currency. For maximum flexibility, these attributes are strings, so let’s ensure that the prices are formatted the way customers expect, with a thousands separator and decimal sign in accordance with the currency. To keep it simple, we will only validate the price format of the largest plan, largePlanPrice, and check the separators based on whether the currency is $ or not.

Validation works for widgets exactly as it does for CMS objects, meaning that a validation can be applied to a widget instance as a whole or, as in this case, to one of its attributes. Here we go:
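The sketch below shows one possible take on this check. It assumes that the currency is stored in an attribute named currency, that the callback's context object exposes the widget as widget, and that prices have either no decimals or exactly two. The regular expressions and all names except largePlanPrice are illustrative.

```javascript
// Thousands separator "," and decimal sign "." (e.g. 1,299.00).
const DOLLAR_PRICE = /^\d{1,3}(,\d{3})*(\.\d{2})?$/;
// Thousands separator "." and decimal sign "," (e.g. 1.299,00).
const OTHER_PRICE = /^\d{1,3}(\.\d{3})*(,\d{2})?$/;

// Attribute-based validation callback (assumed signature: value, { widget }).
function validateLargePlanPrice(largePlanPrice, { widget }) {
  // Assumption: the currency sign is kept in an attribute named "currency".
  const isDollar = widget.get("currency") === "$";
  const pattern = isDollar ? DOLLAR_PRICE : OTHER_PRICE;

  // An empty price does not match either pattern and is rejected as well.
  if (!pattern.test(largePlanPrice)) {
    return isDollar
      ? "The price must be formatted like 1,299.00."
      : "The price must be formatted like 1.299,00.";
  }
}

// Assumed wiring in the editing configuration of the PricingWidget class:
// Scrivito.provideEditingConfig("PricingWidget", {
//   validations: [["largePlanPrice", validateLargePlanPrice]],
// });
```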

Like in the other examples above, in case of a failure, the validation function returns the error message as a string. You can also return an array of strings if several issues were found and you want them to be recognized individually by means of dedicated notifications.
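To illustrate the array variant, here is a small sketch that collects every issue it finds in a price string instead of stopping at the first one. The function name collectPriceIssues and both checks are hypothetical. The sketch returns undefined when nothing was found so that it does not depend on how an empty array would be treated.

```javascript
// Attribute-based validation callback that may report several issues at once.
function collectPriceIssues(price) {
  const issues = [];

  if (price !== price.trim()) {
    issues.push("The price must not start or end with whitespace.");
  }
  if (/[^\d.,\s]/.test(price)) {
    issues.push("The price may only contain digits, dots, and commas.");
  }

  // Each string in the returned array is meant to become its own notification.
  return issues.length > 0 ? issues : undefined;
}
```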

A time span must be valid

There are quite a few use cases in which dates play a role: Press releases are often made public on a fixed date, the validity of product and job offers is limited, etc.

In the Example App, job offers are equipped with two date attributes, datePosted and validThrough. Even though both dates are clearly visible on Job pages, adjusting them could easily be forgotten after copying such a page with the intention to create a similar new offer. So let’s make sure that at least the validThrough date comes after datePosted:
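A sketch of such a check could look as follows. It assumes that both attributes yield JavaScript Date objects (or null if unset), that the job page class is named Job, and that the callback's context object exposes the page as obj. The function name validateValidThrough is made up. Unset dates are not reported by this sketch.

```javascript
// Attribute-based validation callback (assumed signature: value, { obj }).
function validateValidThrough(validThrough, { obj }) {
  const datePosted = obj.get("datePosted");

  // Only compare if both dates are set; Date objects compare by timestamp.
  if (validThrough && datePosted && validThrough <= datePosted) {
    return "The 'valid through' date must be later than the 'date posted'.";
  }
}

// Assumed wiring in the editing configuration of the job page class:
// Scrivito.provideEditingConfig("Job", {
//   validations: [["validThrough", validateValidThrough]],
// });
```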

We are assuming here that validThrough is the error-causing value. However, the editor will be notified about this value being wrong even if datePosted should be adjusted instead. In your particular implementation you may wish to swap the attributes and adapt the comparison, or to validate both attributes, or even take account of the current date in the check.

Especially with date values, keep in mind that page content sometimes needs to be fixed after it was published, so take care not to introduce conditions to pages or widgets that would prevent future publishing.

Columns should not be left empty

As an example of how to check the contents of a structure widget and generate a warning if something should be improved, our last use case deals with column container widgets (simply called “Columns” in the widget browser). The Example App’s ColumnContainerWidget has a columns attribute, a widgetlist that gets initialized with three ColumnWidgets, each of which in turn has a widgetlist attribute, content. Let’s warn the editor if content is empty in any of the columns:
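The following sketch shows one way to express this as a class-based widget validation. It assumes that widget.get("columns") and column.get("content") both return arrays of widgets. The function name validateColumnsNotEmpty and the message text are illustrative.

```javascript
// Class-based validation callback: it receives the widget itself.
function validateColumnsNotEmpty(widget) {
  // Look for a column whose "content" widgetlist has no widgets.
  const hasEmptyColumn = widget
    .get("columns")
    .some((column) => column.get("content").length === 0);

  if (hasEmptyColumn) {
    // Returning an object allows the severity to be specified.
    return {
      message: "Columns should not be left empty.",
      severity: "warning",
    };
  }
}

// Assumed wiring in the editing configuration of the ColumnContainerWidget class:
// Scrivito.provideEditingConfig("ColumnContainerWidget", {
//   validations: [validateColumnsNotEmpty],
// });
```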

The code iterates the columns and returns an object containing a message and a severity key if an empty content attribute is found. So, instead of directly returning just a string (the message) or an array (of messages) to produce an error, we are returning an object to be able to additionally specify the severity, which can be error (the default), warning, or info.

Final words

Validations are an easy-to-implement measure against flawed content getting published. Just keep in mind that editors should always remain able to set things straight, so provide them with clear error messages, useful examples, etc.

In addition to implementing class or attribute-based validation callbacks like the ones we’ve shown here, Scrivito also lets you declare constraints and define a validation callback for having them checked using a third-party library such as Validate.js. See the API documentation for details. Generally speaking, Scrivito’s built-in mechanism is a bit more flexible because it lets you implement whatever you deem appropriate as a validation, making it possible to also address dependencies between attributes. Constraints, in contrast, limit you to a given (library-specific) set of checks, but they are easier to apply.