New in 1.10.0

extractText(obj, options)

Extracts text from a given Obj and returns it as a string.

const extractedText = Scrivito.extractText(obj);

extractText provides the plain-text version of the Obj’s attributes that are listed as extractTextAttributes in its model class. To do so, the function removes HTML tags and newlines from the attribute values. Here are some use cases:

  • Displaying a preview snippet, for example the first 300 characters of a
    • page in a search results list,
    • blog post in a blog post overview,
    • text preview of a PDF file (e.g. in a search results list).
  • Providing metadata for a page, for example by
    • using extracted text in og:description or twitter:description meta tags in the page’s head,
    • using widgets as a content source in a Schema.org JobPosting.
  • Calculating the estimated reading time of a blog post based on the word count.
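
Two of the use cases above, sketched as plain helper functions that take the already extracted text as input. The function names, the reading speed of 200 words per minute, and the shape of the returned meta tag descriptors are assumptions made for illustration; none of them is part of the Scrivito API.

```javascript
// Assumed average reading speed in words per minute (not defined by Scrivito).
const WORDS_PER_MINUTE = 200;

// Estimates the reading time in whole minutes, based on the word count.
function estimateReadingMinutes(text, wordsPerMinute = WORDS_PER_MINUTE) {
  const trimmed = text.trim();
  if (trimmed === "") return 0;

  // Words are taken to be runs of characters separated by whitespace.
  const wordCount = trimmed.split(/\s+/).length;
  return Math.ceil(wordCount / wordsPerMinute);
}

// Builds descriptors for the two description meta tags.
// Open Graph tags use the "property" attribute, Twitter tags use "name".
function buildDescriptionMeta(description) {
  return [
    { property: "og:description", content: description },
    { name: "twitter:description", content: description }
  ];
}

// Possible usage, with blogPost being an Obj whose class defines extractTextAttributes:
// const minutes = estimateReadingMinutes(Scrivito.extractText(blogPost));
// const meta = buildDescriptionMeta(Scrivito.extractText(blogPost, { length: 300 }));
```

How the descriptors are rendered into the document head depends on the application and is left out here.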

Params

  • obj (Obj) – The Obj instance from which text should be extracted.
  • options (Object):
    • length (Number) – The maximum length of the return value. Limiting the length to a reasonable value (e.g. 300 characters) may speed up the text extraction process. Default: 1,000,000,000

Returns

String – the values of the Obj’s extractTextAttributes as a single string, stripped of HTML tags and newlines.

Remarks

This method is loadable, meaning that it is able to return partial results and indicate to Scrivito.load or Scrivito.connect that it needs to be executed again at a later point in time.
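
Outside of a connected component, a loadable function such as this one is typically wrapped in Scrivito.load, which returns a promise for the final result. A minimal sketch: the helper name loadPreviewText and its scrivito parameter (the imported Scrivito module, passed in explicitly to keep the snippet self-contained) are assumptions.

```javascript
// Hypothetical helper: resolves with up to `length` characters of extracted text.
// `scrivito` is expected to be the imported Scrivito module.
function loadPreviewText(scrivito, obj, length = 300) {
  // The callback may be executed several times until all required data is loaded.
  return scrivito.load(() => scrivito.extractText(obj, { length }));
}

// Possible usage:
// loadPreviewText(Scrivito, obj).then((text) => console.log(text));
```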

Attributes such as title should usually be left out of the extractTextAttributes list: search result lists and blog post overviews most likely display the individual titles separately, so including them would repeat the title in the extracted text.

See also

  • extractTextAttributes option of Scrivito.provideObjClass
  • extractTextAttributes option of Scrivito.provideWidgetClass

Examples

Prepare text extraction from instances of a simple Page class:

Scrivito.provideObjClass("Page", {
  attributes: { body: "widgetlist" },
  extractTextAttributes: ["body"]
});

Extract text from a simple HeadlineWidget contained in a widgetlist attribute:

// Provide the HeadlineWidget class
Scrivito.provideWidgetClass("HeadlineWidget", {
  attributes: { headline: "string" },
  extractTextAttributes: ["headline"]
});

// Create a Page instance with a HeadlineWidget
const pageWithHeadlineWidget = Scrivito.Obj.create({
  _objClass: "Page",
  body: [
    new Scrivito.Widget({ _objClass: "HeadlineWidget", headline: "Some facts" })
  ]
});

Scrivito.extractText(pageWithHeadlineWidget);
// => "Some facts"

Extract text from several attributes of a widget:

// Provide a FactWidget class
Scrivito.provideWidgetClass("FactWidget", {
  attributes: {
    key: "string",
    value: "string"
  },
  extractTextAttributes: ["key", "value"]
});

// Create a Page instance with a FactWidget
const pageWithFactWidget = Scrivito.Obj.create({
  _objClass: "Page",
  body: [
    new Scrivito.Widget({
      _objClass: "FactWidget",
      key: "29",
      value: "Number of Widgets included in the Example App"
    })
  ]
});

Scrivito.extractText(pageWithFactWidget);
// => "29 Number of Widgets included in the Example App"

Make use of the length option:

Scrivito.extractText(pageWithFactWidget, { length: 40 });
// => "29 Number of Widgets included in the Exa"
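
As the result shows, the length option cuts the text at a fixed position, possibly in the middle of a word. For a preview snippet, one option is to request one character more than needed and shorten the result to the last complete word. A sketch under that assumption: toSnippet and the "..." suffix are illustrative choices, not part of the Scrivito API.

```javascript
// Shortens `text` to at most `maxLength` characters (plus the "..." suffix)
// without cutting a word in half. Text that already fits is returned unchanged.
function toSnippet(text, maxLength) {
  if (text.length <= maxLength) return text;

  const cut = text.slice(0, maxLength);
  // If the character right after the cut is a space, the cut ends on a whole word.
  if (text[maxLength] === " ") return `${cut.trimEnd()}...`;

  // Otherwise drop the partial last word, unless the cut contains no space at all.
  const lastSpace = cut.lastIndexOf(" ");
  const base = lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
  return `${base.trimEnd()}...`;
}

// Possible usage: ask for maxLength + 1 characters so that truncation can be detected.
// toSnippet(Scrivito.extractText(pageWithFactWidget, { length: 41 }), 40);
// => "29 Number of Widgets included in the..."
```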

Extract text from several widgets:

// Create a Page instance containing two widgets
const pageWithHeadlineAndFactWidget = Scrivito.Obj.create({
  _objClass: "Page",
  body: [
    new Scrivito.Widget({ _objClass: "HeadlineWidget", headline: "Some facts" }),
    new Scrivito.Widget({
      _objClass: "FactWidget",
      key: "29",
      value: "Number of Widgets included in the Example App"
    })
  ]
});

Scrivito.extractText(pageWithHeadlineAndFactWidget);
// => "Some facts 29 Number of Widgets included in the Example App"

Extract text from an html attribute:

// Provide a TextWidget class containing an html attribute
Scrivito.provideWidgetClass("TextWidget", {
  attributes: { text: "html" },
  extractTextAttributes: ["text"]
});

// Create a Page instance with a TextWidget
const pageTextWidget = Scrivito.Obj.create({
  _objClass: "Page",
  body: [
    new Scrivito.Widget({
      _objClass: "TextWidget",
      text: "<h2>Hello &amp; welcome to the <i>Scrivito</i> show.</h2>"
    })
  ]
});

Scrivito.extractText(pageTextWidget);
// => "Hello & welcome to the Scrivito show."

Extract text from a PDF file and limit its length to the first 100 characters:

// Provide a Download Obj class (the text is extracted from an Obj, not from a widget)
Scrivito.provideObjClass("Download", {
  attributes: { blob: "binary" },
  extractTextAttributes: ["blob:text"]
});

// Extract the text of a PDF file that has been uploaded (pdfFile)
Scrivito.extractText(pdfFile, { length: 100 });
// => "White Paper Are You Asking Your CMS Vendor the Right Questions? Scrivito.com Scrivito is pro"