This example will walk you through importing an existing webpage into Prismic. While it assumes your data can be exported as HTML pages, most of the code should be applicable to any format, as long as you can parse it into the JavaScript data structures used by the provided functions. The showcased code supports importing assets, resolving links between documents, and multi-language documents, so it should be applicable to a broad range of webpages.
We will use the `kramdown-prismic` Ruby gem ("gem" is the conventional name for Ruby packages) to convert HTML into the Prismic rich text format. To run it, you will need Ruby 2.6 or later – you can check whether you have the required version installed by running `ruby --version` in your shell. If you don't, you can use a tool like `asdf` or `rbenv` to install it.
Currently, support for the API V2 format used by the Migration API is available only as a pull request. If you're not familiar with the Ruby ecosystem, it might not be immediately obvious how to install it – one option you can try is the `specific_install` RubyGems plugin (you will also need to have `git` available):
# In case of permission issues with a global Ruby install, you will have to add
# `sudo` before each use of the `gem` command
gem install specific_install
# Installs the gem version from the branch
gem specific_install -l "http://github.com/prismicio/kramdown-prismic.git" -b "prismic-v2-format"
# Removes the gem, if you don't need it anymore
gem uninstall kramdown-prismic
There are other options, such as using `bundler` with a `Gemfile` and calling the command with `bundle exec`, or checking out the source, building the gem with `gem build`, and installing the build result. Pick the one that works best for you.
Our example will be using a repeatable custom type named `story` with the following fields:
| Field name | Field type |
| --- | --- |
| `uid` | uid |
| `chapterIllustration` | image |
| `chapterTitle` | title |
| `previousChapter` | content relationship |
| `nextChapter` | content relationship |
| `contents` | rich text |
Please create it if you want to follow along using our example data. For your convenience, you can use the JSON file provided at `examples/html/customtypes/story/index.json` with your Slice Machine setup. You also need to have the `en-us` and `fr-fr` locales enabled, because the example script imports documents into those two locales. If you use the legacy custom type editor, you can copy the value of the `json` property from the Slice Machine file into the editor; it should be compatible with it.
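For orientation, here is a rough sketch of what the `json` value for such a custom type could look like in the legacy format – this is an illustration only, and the actual definition shipped at `examples/html/customtypes/story/index.json` is the authoritative one:

```json
{
  "Main": {
    "uid": { "type": "UID", "config": { "label": "UID" } },
    "chapterIllustration": { "type": "Image", "config": { "label": "Chapter illustration" } },
    "chapterTitle": {
      "type": "StructuredText",
      "config": { "single": "heading1", "label": "Chapter title" }
    },
    "previousChapter": {
      "type": "Link",
      "config": { "select": "document", "customtypes": ["story"], "label": "Previous chapter" }
    },
    "nextChapter": {
      "type": "Link",
      "config": { "select": "document", "customtypes": ["story"], "label": "Next chapter" }
    },
    "contents": {
      "type": "StructuredText",
      "config": { "multi": "paragraph,em,strong,hyperlink,image", "label": "Contents" }
    }
  }
}
```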
Exporting data from an existing website depends a lot on the tools you used to create it, so explaining how to prepare data in your specific case is out of scope for this tutorial.
In the interest of providing a working example, we will use a simple HTML website from the `examples/html` directory. If your current solution supports exporting your documents as HTML pages, the example will be directly useful to you. Otherwise, you will have to adapt it to your export data format.
Once you have exported your website, you have to parse it into data structures you can then use to re-create the webpage as Prismic documents.
This process is also very specific to how the website is built. In the case of the example pages provided, an appropriate solution is to use an HTML parser and extract data from the document structure – a popular option is the `cheerio` library, with a familiar jQuery-like interface. For other export formats, you will have to choose other libraries appropriate for that format.
The HTML import script is located at `src/examples/html/index.mjs` and you can run it with `yarn examples:html` or `npm run examples:html` if you want to test how it works.
Here is how the script code itself looks:
// Generic import processing functions provided for use in examples.
import {
findDocuments, mapDocuments, syncWithMigrationRelease, assignAlternateLanguages,
findAssets, syncWithMediaLibrary, resolveReferences,
} from "../../import/index.mjs";
// Example-specific code for processing the provided HTML pages into
// Prismic documents of the previously mentioned `story` custom type.
import story from "./story.mjs";
const documents = await findDocuments("examples/html/**/*.html")
.then(mapDocuments(story.fromHtml))
.then(syncWithMigrationRelease({ includeFields: false, onlyLanguages: [ "en-us" ] }))
.then(assignAlternateLanguages({ mainLanguage: "en-us" }))
.then(syncWithMigrationRelease({ includeFields: false }))
.then(findAssets(story.findAssets))
.then(syncWithMediaLibrary)
.then(resolveReferences(story.resolveReferences))
.then(syncWithMigrationRelease());
This is not pseudocode, but actual code taken from the `index.mjs` file – the script is structured as a series of processing steps building on each other, transforming the data from the input format into fully imported documents, step by step.
While what each step does should be fairly self-explanatory from the function names, you might wonder why we call `syncWithMigrationRelease` multiple times. This is necessary because some operations require the IDs of existing documents:

- setting an alternate language is possible only when creating documents – to solve that issue, we first have to import the documents for the main language (using the `onlyLanguages: [ "en-us" ]` parameter), and only then use their IDs as alternate language IDs when importing the documents in other languages,
- resolving references to other documents in document fields also requires knowing their IDs, and you can't upload a document with incorrect field values – to solve that issue, we first import the documents without their contents (using the `includeFields` parameter), then use the resulting IDs to resolve the document references, and finally update the documents with the finalised contents.
You may also notice that there are `dumpState` and `readState` helpers available – you can insert them between the processing steps to save and restore the processing state. This can be useful if you want to see what's happening without resorting to a debugger, or even to resume work from a later step if something failed, by restoring the last successful step's state with `readState`.
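To give an idea of how they could work, here is a minimal sketch of such helpers, assuming the processing state only contains JSON-serializable values plus `Map`s and `URL`s – the actual implementations in the repository may differ:

```javascript
import fs from "node:fs/promises";

// Convert `Map` and `URL` values to plain JSON and back.
// `this[key]` is used to look at the value before `URL.prototype.toJSON` runs.
function replacer(key, value) {
  const original = this[key];
  if (original instanceof Map) return { $map: [ ...original ] };
  if (original instanceof URL) return { $url: original.href };
  return value;
}

const reviver = (key, value) => {
  if (value?.$map) return new Map(value.$map);
  if (value?.$url) return new URL(value.$url);
  return value;
};

// Saves the processing state to a file and passes it through unchanged
export const dumpState = stateFile => async processingState => {
  await fs.writeFile(stateFile, JSON.stringify(processingState, replacer, 2));
  return processingState;
};

// Restores a previously saved processing state
export const readState = stateFile => async () =>
  JSON.parse(await fs.readFile(stateFile, "utf8"), reviver);
```

With helpers like these, you could insert `.then(dumpState("after-assets.json"))` after a step, and on a later run start from `.then(readState("after-assets.json"))` instead of repeating the earlier steps.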
The first step is finding all the files we want to import into Prismic. In the case of files already present on the computer – like our example – the simplest solution is to use a globbing library, which will find all the files matching a specified pattern for you, without having to manually traverse the filesystem using `node:fs`. A popular option for that is the appropriately named `glob` package.
Here's an excerpt from `src/import/documents.mjs`, showing how we discover files:
import { pathToFileURL } from "node:url";
import { glob } from "glob";
export const findDocuments = async documentGlob => {
const paths = await glob(documentGlob);
const documents = paths.map(path => ({ path: pathToFileURL(path) }));
return { documents };
};
When you run it against the example directory, the function will discover all HTML files matching the glob pattern – you can consult the library's readme for a list of supported glob patterns – and return them as an object representing the processing state.
A good way to experiment with how it works is to start a REPL using the `node` command and enter the code there:
> var { findDocuments } = await import("./src/import/documents.mjs");
undefined
> processingState = findDocuments("examples/html/**/*.html");
Promise { ... }
> await processingState
{
documents: [
{ path: URL { ... } },
{ path: URL { ... } },
{ path: URL { ... } },
{ path: URL { ... } },
{ path: URL { ... } },
{ path: URL { ... } },
{ path: URL { ... } },
{ path: URL { ... } },
{ path: URL { ... } },
{ path: URL { ... } }
]
}
Two choices might bear explanation – the first is the usage of the `URL` class to represent file paths. It's an arguably more convenient way to resolve paths and to tell whether links point to local pages in the export:
Explanation on URL resolve behaviour
// The URL path is always absolute, thus unambiguous in what it refers to
> processingState.documents[0].path.href
"file:///home/user/project/examples/html/fr/good-end.html"
// It's very easy to tell if it's a local file or a web link by looking at the protocol
> processingState.documents[0].path.protocol
"file:"
// If we pass the current file's URL as the `base` URL (the second argument),
// relative links will keep the `file:` protocol of the file path
> new URL("relative/file.ext", "file:///some/path/that/is/absolute").href
"file:///some/path/that/is/relative/file.ext"
// External links will completely replace the base URL and use link's protocol
// We can use that to differentiate them from local assets or documents
> new URL("https:///some.web/page", "file:///some/path/that/is/absolute").href
"https:///some.web/page"
The other choice is that – instead of returning the document array directly – we decided to nest the result under a `documents` property of an object. We will use this object to represent the current processing state, so we can enrich it with additional information (for example, a mapping from file path to Prismic document ID) as we continue to process the documents further.
It is usually helpful to approach a problem by decomposing it into smaller pieces. As such, we will first write a function that can process a single document, and then apply it to all documents.
Here's an excerpt from `src/examples/html/story.mjs` to show how such a function could be structured at a high level:
import fs from "node:fs/promises";
import path from "node:path";
import cheerio from "cheerio";
import slugify from "slugify";
export const fromHtml = async document => {
// Read the HTML page and parse it
const pageContents = await fs.readFile(document.path, { encoding: "utf8" });
const $ = cheerio.load(pageContents);
// Extract the relevant data from the parsed page
const title = $("title").text();
// ...
// Use the page filename as the document slug – we will use that value
// to associate alternate language versions
const filename = path.basename(document.path.pathname, ".html");
const uid = slugify(`story-${ filename }`, { strict: true, lower: true });
// ...
// And add this information to the document
// For simplicity we mirror the format of the `POST /documents` Migration API endpoint:
// https://prismic.io/docs/migration-api-technical-reference#post:
return {
...document,
title,
uid,
}
}
Feel free to refer to the aforementioned file should you want to see the full implementation. It basically does more of the same and should be fairly well-commented.
Now we need to apply this mapping function to all the documents. To accomplish that, we will introduce a `mapDocuments` helper:
// Close over a (potentially async, like `fromHtml` above) function mapping
// a single document
> mapDocuments = mapping =>
// And provide a function applying it to all the documents
// in the processing state
async processingState => {
// Since a mapping can be potentially asynchronous, we collect
// the results using `Promise.all`
const documents = await Promise.all(
processingState.documents.map(
async document => ({ ...document, ...await mapping(document) })
)
);
// Collect mapped documents into a `documentMap` keyed by their paths
const documentMap = new Map(
documents.map(document => [ document.path.toString(), document ])
);
return {
...processingState,
documents,
documentMap,
}
};
[Function: mapDocuments]
With those two functions in place, we can now test how the mapping code works:
// We will import the existing implementations for convenience
> var { fromHtml: storyFromHtml } = await import("./src/examples/html/story.mjs");
undefined
// Update the processing state with document fields
> processingState = processingState.then(mapDocuments(storyFromHtml))
Promise { ... }
// And see that it worked
> await processingState
{
documents: [
{
path: URL { ... },
type: "story",
title: "La Geste de Foo Bar – Good End",
uid: "story-good-end",
lang: "fr-fr",
data: {
chapterTitle: "Good End!",
chapterIllustration: URL { ... },
previousChapter: undefined,
nextChapter: URL { ... },
contents: [
{
type: "paragraph",
text: "Hark, for you have reached the good end!",
spans: [
{ type: "em", start: 31, end: 39 }
]
},
...
]
}
},
...
],
documentMap: Map(10) {
"file:///home/user/project/.../examples/html/fr/good-end.html" => {
... // the same document as above
},
...
}
}
As you can see, our barebones `{ path }` document representation was enriched with the field information we extracted from the HTML page, and the documents were indexed into a map, so we can easily get a corresponding Prismic document by the path of its source HTML file.
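For example, continuing the same REPL session, looking a document up by the path of its source file could look like this (your local paths and values will of course differ):

```
> (await processingState).documentMap.get(
(await processingState).documents[0].path.toString()
).uid
"story-good-end"
```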
Prismic expects you to provide formatted text in the Prismic rich text format. When migrating documents, it can often be useful to traverse the rich text content – for example, to find all referenced assets or to remove unwanted elements.
To facilitate that, we provide a `mapRichText` helper that you can use to transform the rich text – you can read the full implementation in `src/import/content.mjs`, but at a high level it will iterate over all the elements, apply a `span` mapping function to their spans, and then apply the `element` mapping function to the resulting element with updated spans.
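For illustration, here is a minimal sketch of how a helper like this could be implemented – it is consistent with the behaviour described below, but the implementation in `src/import/content.mjs` is the authoritative one and may differ in details:

```javascript
// A sketch of a `mapRichText`-style helper: `undefined`/`null` results remove
// the item, arrays are spliced in, and anything else replaces the item
export const mapRichText = ({ element = e => e, span = s => s } = {}) =>
  richText => richText.flatMap(originalElement => {
    // First map the spans, dropping empty results and splicing arrays
    const spans = (originalElement.spans ?? []).flatMap(originalSpan =>
      [ span(originalSpan) ].flat().filter(result => result != null)
    );
    // Then map the element itself, with its updated spans
    const mapped = element(
      originalElement.spans ? { ...originalElement, spans } : originalElement
    );
    return [ mapped ].flat().filter(result => result != null);
  });
```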
Explanation on how to use mapRichText
Depending on what you return from the mapping functions, you can accomplish different results. You can of course return a single element or span, and it will be replaced in its containing array. But you can also do other things.
For example, if you return `undefined` or `null`, the currently processed element or span will be removed from its containing array:
> var { mapRichText } = await import ("./src/import/content.mjs");
undefined
// Remove unwanted elements or spans
> removeEmSpans = mapRichText(
{ span: span => span.type == "em" ? undefined : span }
)
[Function (anonymous)]
// Documents have spans
> (await processingState).documents[0].data.contents.map(({ spans }) => spans)
[
[ { type: "em", start: 31, end: 39 } ],
[],
[
{ type: "em", start: 62, end: 73 },
{ type: "em", start: 215, end: 224 }
]
]
// And now they don't
> removeEmSpans(
(await processingState).documents[0].data.contents
).map(({ spans }) => spans)
[ [], [], [] ]
If you return an array of one or more elements or spans, they will be spliced into the containing array:
// Add a paragraph after an image
> complimentImage = mapRichText({
element: (element => element.type == "image"
? [ element, { type: "paragraph",
text: "That's a cool image!" } ]
: element)
})
[Function (anonymous)]
// Now everybody is sure to know it's a cool image
> complimentImage((await processingState).documents[1].data.contents)
[
...
{
type: "image",
alt: null,
url: "../assets/images/magic-battle.png"
},
{
type: "paragraph",
text: "That's a cool image!"
},
...
]
You can even decide to return a non-element result if it's useful for you – for example, you can leverage this to traverse the rich text and collect information you want, such as all URLs referenced in the rich text content:
> var { uniq } = await import ("lodash-es");
undefined
> findAllUrls = mapRichText({
// Find links in a span
span: span => span.type == "hyperlink"
? span.data.url
: undefined,
// Find links in an element and concatenate all links
// previously found in the element's spans
element: element => [
element.type == "image"
? element.url
: undefined,
...element.spans ?? []
],
})
[Function (anonymous)]
// These are all the URLs referenced in the document
> uniq(findAllUrls((await processingState).documents[1].data.contents))
[
"bad-end.html",
"good-end.html",
"../assets/images/magic-battle.png"
]
That last example is something that will come in handy in a section or two.
At this point, it might be a good idea to run our documents through the Migration API. As we mentioned in the overview, we can't include field content when importing documents initially, due to unresolved asset/document references, and we can't import all documents at once, because we require main language document IDs for documents in alternate languages. To resolve this issue, the function importing the documents will allow you to specify which documents you want to sync with each call.
To import the documents, we will make requests to the Prismic Migration API. We will use the popular `axios` library to talk to it and `axios-rate-limit` to respect the rate limits of the API. You can consult the `src/import/utils.mjs` file to see how to configure the axios client properly; below we will just import it from that file.
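For reference, such a pre-configured client could look roughly like the sketch below – the environment variable names and the exact rate limit are assumptions, not taken from the repository; see `src/import/utils.mjs` and the Migration API documentation for the authoritative configuration:

```javascript
import axios from "axios";
import rateLimit from "axios-rate-limit";

// A sketch of a rate-limited Migration API client; the environment variable
// names (PRISMIC_WRITE_TOKEN, PRISMIC_API_KEY, PRISMIC_REPOSITORY) are assumed
export const migrationApiClient = rateLimit(
  axios.create({
    baseURL: "https://migration.prismic.io",
    headers: {
      "Authorization": `Bearer ${ process.env.PRISMIC_WRITE_TOKEN }`,
      "x-api-key": process.env.PRISMIC_API_KEY,
      "repository": process.env.PRISMIC_REPOSITORY,
    },
  }),
  { maxRequests: 1, perMilliseconds: 2000 }
);
```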
As usual, we will first focus on importing a single document:
> var { migrationApiClient } = await import("./src/import/utils.mjs");
undefined
> var { pick } = await import("lodash-es");
undefined
// Ensures this document is present in the migration release and its contents are up to date
> syncDocumentWith = ({ includeFields = true } = {}) => async document => {
const requestParams = document.id
? { method: "PUT", url: `/documents/${ document.id }` }
: { method: "POST", url: "/documents" }
// Send only the fields that we know the API expects
const data = pick(document, [
"title", "type", "lang", "uid", "alternate_language_id"
]);
// Include fields only if asked to
data.data = includeFields
? document.data
: {};
// Upsert the document
const response = await migrationApiClient({
...requestParams, data
})
// Add the uploaded document's id to the document metadata
return {
...document,
id: response.data.id,
};
};
[Function: syncDocumentWith]
And then we apply this to all the documents in the processing state:
> syncWithMigrationRelease = (options = {}) => {
const syncDocument = syncDocumentWith(options);
const { onlyLanguages } = options;
return processingState => {
// Create a set of languages we want to sync
const allowedLanguages = new Set(
onlyLanguages ?? processingState.documents.map(({ lang }) => lang)
);
// And sync only documents in those languages
const syncMatchingDocuments = mapDocuments(document =>
allowedLanguages.has(document.lang)
? syncDocument(document)
: document
);
return syncMatchingDocuments(processingState);
};
};
[Function: syncWithMigrationRelease]
// Let's now import all the English language documents
> processingState = processingState.then(
syncWithMigrationRelease({ includeFields: false, onlyLanguages: [ "en-us" ] })
)
Promise { ... }
// This might take a few seconds the first time you do this, due to the rate limits
> await processingState
{
documents: [
...
{
path: URL { ... },
type: "story",
title: "The Story of Foo Bar – Good End",
uid: "story-good-end",
lang: "en-us",
data: { ... },
id: "ZhzVZhAAAOMK0PiB"
},
...
],
documentMap: Map(10) {
...
"file:///home/user/project/.../examples/html/en/good-end.html" => {
... // the same document as above
}
}
}
As you can see, the English documents now have Prismic IDs and will be present in the migration release, if you check it in the Prismic UI. We have also stored the mapping between the document paths and documents in the `documentMap`, which will come in handy when we want to turn references to other imported pages into Prismic document links via their IDs.
As mentioned in the overview, to properly handle alternate language versions, we have to first separately import documents in your main language (`en-us` in our example) and then update documents in alternate languages to reference the main language document as their `alternate_language_id` before importing them.
To do this, we can use the following function:
> assignAlternateLanguages = ({
mainLanguage,
commonKey = ({ uid }) => uid
} = {}) =>
async processingState => {
// Collect the mapping from the `commonKey`s of main language documents
// to their IDs
const mainDocuments = new Map(
processingState.documents
.filter(({ lang }) => lang == mainLanguage)
.map(document => [ commonKey(document), document.id ])
);
// Add alternate language id to documents in other languages
const withAlternateLanguageIds = mapDocuments(document =>
document.lang != mainLanguage
? {
...document,
alternate_language_id: mainDocuments.get(commonKey(document))
}
: document
);
return withAlternateLanguageIds(processingState);
}
Using this function, we can now update the documents with appropriate `alternate_language_id`s and sync them with the migration release:
> processingState = processingState.then(assignAlternateLanguages({ mainLanguage: "en-us" }))
Promise { ... }
> processingState = processingState.then(syncWithMigrationRelease({ includeFields: false }))
Promise { ... }
> await processingState
{
documents: [
{
path: URL { ... },
type: "story",
title: "La Geste de Foo Bar – Good End",
uid: "story-good-end",
lang: "fr-fr",
data: { ... },
id: "ZhzVZhAAAOMK0PeF",
alternate_language_id: "ZhzVZhAAAOMK0PiB" // same as English document below
},
...
{
path: URL { ... },
type: "story",
title: "The Story of Foo Bar – Good End",
uid: "story-good-end",
lang: "en-us",
data: { ... },
id: "ZhzVZhAAAOMK0PiB"
},
...
],
...
}
A good next step would be to find and upload all assets referenced in your documents. To do that, we will have to traverse all the documents and return all used assets. While it might sound daunting, it is not very complicated in practice. Once again, let's first focus on a single document case and then see how to apply it to multiple documents.
How the document maps to assets depends entirely on the document structure. In our case, we have a `chapterIllustration` field that is an image, and the `contents` field can also contain images nested in its Prismic rich text. To find assets in the `contents` field, we can reach for the `mapRichText` helper described above.
> var { mapRichText } = await import ("./src/import/content.mjs");
undefined
// The following will create a mapper that finds all the images in a Prismic rich text
> findAssetsInRichText = mapRichText({
element: element =>
element.type == "image"
? element.url
: undefined,
})
[Function (anonymous)]
> findAssetsInRichText((await processingState).documents[1].data.contents)
[ "../assets/images/magic-battle.png" ]
// Now we map results to URLs for consistency and add the chapter illustration
> findStoryAssets = ({ data, path }) => [
data.chapterIllustration,
...findAssetsInRichText(data.contents).map(src => new URL(src, path)),
]
[Function: findStoryAssets]
> findStoryAssets((await processingState).documents[1])
[
URL { ...}, // for assets/images/dusk.png
URL { ...}, // for assets/images/magic-battle.png
]
Now that we can map a document to the assets it contains, the only thing left is to apply this mapping to all the documents and collect the results in a `Map` for future reference, just like we did for the uploaded documents:
// Take an `assetMapping` that will return all assets for a given document
> findAssets = assetMapping =>
// And return a function that will apply it to all assets in the processing state
async processingState => {
// Map documents to assets contained therein
const foundAssets = await Promise.all(
processingState.documents.map(assetMapping)
);
// Collect them into a map
const assetMap = new Map(
foundAssets.flatMap(assets => assets.map(url => [ url.toString(), { url } ] ))
);
// And add them to processing state
return { ...processingState, assetMap };
}
[Function: findAssets]
// And now we can find all assets
> processingState = processingState.then(findAssets(findStoryAssets))
Promise { ... }
> await processingState
{
documents: [
...
],
documentMap: Map(10) {
...
},
assetMap: Map(9) {
"file:///home/user/.../assets/images/good-end.png" => { url: URL { ... } },
...
}
}
To upload the discovered assets to the Media Library, we will use the Prismic Asset API. We will also use `axios` to talk to this API; a pre-configured client is available in `src/import/utils.mjs`.
As usual, we will first focus on uploading a single asset:
> var { assetApiClient } = await import("./src/import/utils.mjs")
undefined
// Ensures that this asset is already present in the Asset API and has an ID
> syncAsset = async asset => {
// If the asset already has an id, there's nothing to do
if (asset.id) return asset;
const data = await assetToFormData(asset);
const response = await assetApiClient.postForm("/assets", data)
// Add the Prismic Asset API ID to asset's metadata
return { ...asset, id: response.data.id };
}
[AsyncFunction: syncAsset]
As you can see, we will upload the asset only if we didn't already do so – it's a rather naive check, as we only verify the presence of the `id` property, but it should suffice for this simple example.
If you need something more robust, you can consider calling the API to verify that the ID exists, or alternatively hashing the asset and storing the hash in its notes, so you will not re-upload an identical asset, even when retrying your migration.
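As a starting point, computing such a hash could look like the sketch below – how you then store it (for example, in the asset's notes when uploading) and look it up before re-uploading is left to your implementation:

```javascript
import { createHash } from "node:crypto";
import fs from "node:fs/promises";

// A sketch of fingerprinting a local asset, so a retried migration
// could recognise an identical file that was already uploaded
const hashAsset = async asset => {
  const contents = await fs.readFile(asset.url);
  return createHash("sha256").update(contents).digest("hex");
};
```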
Uploading the asset is done by calling the `POST /assets` endpoint with the asset attached as a multipart HTTP form. To accomplish that, we can use the `form-data` library:
> var path = await import ("node:path")
undefined
// The `FormData` constructor is a default import
> var { default: FormData } = await import("form-data")
undefined
// Return multi-part form data used to upload an asset to Asset API
> assetToFormData = async asset => {
const existingAsset = await readAssetAsStream(asset);
const filename = path.basename(asset.url.pathname);
const formData = new FormData();
// Asset API requires files to have a filename specified
formData.append("file", existingAsset, { filename });
if (asset.altText) formData.append("alt", asset.altText)
return formData;
}
[AsyncFunction: assetToFormData]
The `form-data` library can accept files as a form field in different formats. A very convenient option is an auto-closeable stream, because that way we don't have to care whether the asset we're uploading comes from a file you saved as part of the export or is an external image you now want to import into the Media Library:
> var fs = await import("node:fs/promises")
undefined
> var axios = await import("axios")
undefined
// Returns an existing asset as a readable auto-closeable stream to use for upload
> readAssetAsStream = async asset => {
// Return a stream for the asset if it's a local file
// Here is where URLs being actual URL objects come in handy
if (asset.url.protocol == "file:") {
return fs.open(asset.url).then(file => file.createReadStream());
}
// Try to fetch it from the internet otherwise
const response = await axios({
method: "GET",
url: asset.url,
responseType: "stream"
});
return response.data;
}
[AsyncFunction: readAssetAsStream]
And with all those pieces in place we can test uploading a single asset:
> var { pathToFileURL } = await import("node:url")
undefined
> await syncAsset({ url: pathToFileURL("./examples/html/assets/images/dawn.png") })
{
url: URL { ... },
id: "Zhg9brOmp5Xm233k"
}
The final step is once again applying this function to all the assets:
// Add asset metadata to documents and ensure they are present in the Media Library
> syncWithMediaLibrary = async processingState => {
const uploadedAssets = [];
// Unfortunately, iterators don't support a `map` method in Node 20 LTS yet,
// so we have to resort to a `for ... of` loop and mutating an array
for (const [key, asset] of processingState.assetMap ?? new Map()) {
uploadedAssets.push(
syncAsset(asset).then(uploaded => [key, uploaded])
);
}
// Collect a mapping from asset path to an uploaded asset
const assetMap = new Map(await Promise.all(uploadedAssets));
return { ...processingState, assetMap };
}
[AsyncFunction: syncWithMediaLibrary]
// Import the assets
> processingState = processingState.then(syncWithMediaLibrary)
Promise { ... }
// This might also take some time, due to the rate limits
> await processingState
{
documents: [ ... ],
documentMap: Map(10) { ... },
assetMap: Map(9) {
"file:///home/user/project/.../assets/images/good-end.png" => {
url: URL { ...},
id: "ZhzcxLOmp5Xm236V"
},
...
}
}
As you can see, the assets now have `id`s, and we are also indexing our assets into an `assetMap` for future reference. In the next step, we will use that information to properly link assets and documents together.
At this point we now have all the information we need to resolve asset and document references in our document fields.
Let's once again start from the bottom up, by creating a function that will use `assetMap` and `documentMap` to resolve a passed URL to an ID-based Prismic asset or document reference:
// Because we will be using this function multiple times per document and passing it
// around, we close over the common data, so the returned function "remembers" it and
// we don't need to pass it as parameters again
> makeReferenceResolver = ({ assetMap, documentMap, baseUrl }) =>
// Given the reference URL we will return a matching document or asset, if any
referenceUrl => {
if (!referenceUrl) return;
// As discussed in the asset discovery step, this allows us to properly resolve
// the image and link URLs relative to the document we're processing
const url = new URL(referenceUrl, baseUrl).toString();
// If the URL was in the asset map, it was an asset link
if (assetMap.has(url)) {
return { id: assetMap.get(url).id, link_type: "Media" };
}
// If the URL was in the document map, it was a document link
if (documentMap.has(url)) {
return { id: documentMap.get(url).id, link_type: "Document" };
}
}
[Function: makeReferenceResolver]
We can now apply this function to a document, to see how it works in practice:
// Pick a document to test with
> document = (await processingState).documents[1];
{
path: URL { ...},
type: "story",
title: "La Geste de Foo Bar – Dusk",
uid: "story-dusk",
lang: "fr-fr",
data: {
chapterTitle: [ ... ],
chapterIllustration: URL { ... },
previousChapter: URL { ... },
nextChapter: undefined,
contents: [ ... ]
},
id: "ZhzVZhAAAO8K0PiF"
}
// Create the resolver
> resolveReference = makeReferenceResolver(
{ ...(await processingState), baseUrl: document.path }
);
[Function (anonymous)]
// And resolve some references: your IDs may of course vary
> resolveReference(document.data.chapterIllustration)
{ id: "ZhxIBLOmp5Xm236M", link_type: "Media" }
> resolveReference(document.data.previousChapter)
{ id: "ZhxIABAAAJgKzoqK", link_type: "Document" }
Of course this function handles only a single field value, so we need to create a way to apply it to all the fields of the document. Let's first create a helper function that will take a resolver and a field name and return a function that updates a single field in the document:
// Close over the resolver, because we will be generating multiple field resolvers
// for a single document
> resolveFieldWith = resolver =>
// Generate a field resolver for the given field name
fieldName =>
data => {
// Try to resolve the field value to its Prismic ID, if any
const result = resolver(data[fieldName]);
// If the field's value has a corresponding Prismic asset/document, update
// the field with the metadata provided by the resolver
return result
? { ...data, [fieldName]: result }
: data;
}
[Function: resolveFieldWith]
And here's how it works for one of our documents:
// We use the `resolveReference` from the previous REPL session to create
// a field reference resolver factory
> resolveLinkField = resolveFieldWith(resolveReference)
[Function (anonymous)]
// Then we create reference resolvers for particular fields
> resolveChapterIllustration = resolveLinkField("chapterIllustration")
[Function (anonymous)]
> resolvePreviousChapter = resolveLinkField("previousChapter")
[Function (anonymous)]
> resolveNextChapter = resolveLinkField("nextChapter")
[Function (anonymous)]
// We now test it using the same `document` from the above session
> resolveChapterIllustration(document.data).chapterIllustration
{ id: "ZhxIBLOmp5Xm236M", link_type: "Media" }
> resolvePreviousChapter(document.data).previousChapter
{ id: "ZhxIABAAAJgKzoqK", link_type: "Document" }
> resolveNextChapter(document.data).nextChapter
undefined
As you can see, we can now easily create functions updating a single field in a document – but it would be handy to compose them together to handle all the fields of the document. One interesting option is to use promises.
While promises are usually used for processing data asynchronously (for example, with data arriving over the network), that's not the only way they can be used. You can create an already resolved promise over a piece of synchronous data using `Promise.resolve` and then call the `then` method to chain data transformations on the subject of the promise. This method is also flexible, because you can easily slot in an actual asynchronous transformation (for example, one that needs to read a file or call an external service to properly resolve the reference) between synchronous ones without changing anything.
Let's see how this could look for a single document:
> resolveStoryReferences = (({ document, resolveLinkField }) =>
Promise.resolve(document.data)
// We apply each field resolver to the data
.then(resolveLinkField("chapterIllustration"))
.then(resolveLinkField("previousChapter"))
.then(resolveLinkField("nextChapter"))
// And then update the document with resolved data
.then(data => ({ ...document, data })))
[Function: resolveStoryReferences]
// And now let's apply it to a document to see if it works
> (await resolveStoryReferences({ document, resolveLinkField })).data
{
chapterTitle: [ ... ],
chapterIllustration: { id: "ZhzcxbOmp5Xm236X", link_type: "Media" },
previousChapter: { id: "ZhzVZxAAAOMK0PiJ", link_type: "Document" },
nextChapter: undefined,
contents: [ ... ]
}
As you can see, we can use this method to compose multiple transformations on a document's fields, building the resolver for a whole document from smaller pieces. What is left now is to apply this single-document resolver to all the documents in the processing state:
> resolveReferences = referenceMapper => processingState => {
// To resolve the references we will map over the documents using the `referenceMapper`
const resolveDocumentReferences = mapDocuments(document => {
// Mapping references requires providing resolvers to `referenceMapper`,
// so we create them here
const resolver = makeReferenceResolver({
...processingState, baseUrl: document.path
});
const resolveLinkField = resolveFieldWith(resolver);
// And pass them alongside the document to the `referenceMapper`
return referenceMapper({ document, resolveLinkField })
});
// And then we just apply the document mapper to current processing state
return resolveDocumentReferences(processingState);
}
[Function: resolveReferences]
And now we can resolve those fields in all documents by applying the resolver produced by `resolveReferences(resolveStoryReferences)` to the processing state:
> await (
processingState
.then(resolveReferences(resolveStoryReferences))
.then(({ documents }) => documents.map(({ data }) => data))
)
[
{
chapterTitle: [ ... ],
chapterIllustration: { id: "ZhzcxLOmp5Xm236V", link_type: "Media" },
previousChapter: undefined,
nextChapter: { id: "ZhzVZxAAAOMK0PiN", link_type: "Document" },
contents: [ ... ]
},
{
chapterTitle: [ ... ],
chapterIllustration: { id: "ZhxIBLOmp5Xm236M", link_type: "Media" },
previousChapter: { id: "ZhxIABAAAJgKzoqK", link_type: "Document" },
contents: [
...
{
type: "image",
alt: null,
url: "../assets/images/magic-battle.png"
},
...
]
},
...
]
As you can see, fields like `chapterIllustration` or `nextChapter` now have properly resolved references, but it seems we have forgotten about the rich text field! Since we already have `mapRichText`, which helps us apply functions to each span and element easily, to add support for resolving references in a rich text field we only need to write resolvers that work with those values.
Those functions should be very similar to each other and hopefully straightforward:
- first check if it's an element/span that is of interest to us (an image or a link) – otherwise return the element/span unchanged,
- then check if the URL contained in this element/span resolves to an existing asset/document and if so, update the element/span with the resolved value,
- otherwise, return the element/span unchanged.
Let's see how that looks in practice:
// Generates a function that resolves references in a Prismic rich text element
> makeElementResolver = resolveReference => element => {
switch (element.type) {
// We only care about images
case "image": {
// We separate `url` from other element properties,
// to remove it from the resolved element
const { url, ...restElement } = element;
const { id } = resolveReference(url) ?? {};
return id
? { ...restElement, id }
: element;
}
// If it's not an image, return the element as-is
default:
return element;
}
}
[Function: makeElementResolver]
The `makeSpanResolver` function for spans is very similar – it just focuses on links:
> makeSpanResolver = resolveReference => span => {
switch (span.type) {
// We only care about links
case "hyperlink": {
// We separate `url` from other span properties,
// to remove it from the resolved span
const { url, ...restData } = span.data;
const { id, link_type } = resolveReference(url) ?? {};
return id
? { ...span, data: { ...restData, id, link_type } }
: span;
}
// If it's not a link, return the span as-is
default:
return span;
}
}
[Function: makeSpanResolver]
Now the only thing left is to apply those functions to the rich text. Let's create a rich text resolver helper by passing those mapping functions as arguments to `mapRichText`:
> resolveRichTextReferences = referenceResolver =>
mapRichText({
element: makeElementResolver(referenceResolver),
span: makeSpanResolver(referenceResolver),
})
[Function: resolveRichTextReferences]
We can now update our resolution functions to work with the rich text field resolver:
> resolveReferences = referenceMapper => processingState => {
const resolveDocumentReferences = mapDocuments(document => {
const resolver = makeReferenceResolver({
...processingState, baseUrl: document.path
});
return referenceMapper({
document,
// We will now pass resolvers as an object, to make it more convenient
// to refer to multiple resolvers
resolvers: {
linkField: resolveFieldWith(resolver),
richTextField: resolveFieldWith(resolveRichTextReferences(resolver)),
}
})
});
return resolveDocumentReferences(processingState);
}
[Function: resolveReferences]
> resolveStoryReferences = ({ document, resolvers }) => {
return Promise.resolve(document.data)
.then(resolvers.linkField("chapterIllustration"))
.then(resolvers.linkField("previousChapter"))
.then(resolvers.linkField("nextChapter"))
.then(resolvers.richTextField("contents"))
.then(data => ({ ...document, data }));
}
[Function: resolveStoryReferences]
And now we can see that it also properly resolves references in rich text fields:
> (await processingState.then(resolveReferences(resolveStoryReferences))).documents[1].data
{
chapterTitle: [ ... ],
chapterIllustration: { id: "ZhxIBLOmp5Xm236M", link_type: "Media" },
previousChapter: { id: "ZhxIABAAAJgKzoqK", link_type: "Document" },
contents: [
{
type: "paragraph",
text: "Lorem ipsum dolor sit amet ...",
spans: [
{
type: "hyperlink",
start: 454,
end: 460,
data: { link_type: "Document", id: "ZhxIABAAAJkKzoqD" }
},
{
type: "hyperlink",
start: 672,
end: 678,
data: { link_type: "Document", id: "ZhxIABAAAJkKzoqD" }
}
]
},
...
{
type: "image",
alt: null,
id: "ZhxIBrOmp5Xm236O"
},
...
]
}
The only thing left now is to update the processing state with this information and sync the documents with the Migration API one more time – and we're done:
> processingState = processingState.then(resolveReferences(resolveStoryReferences))
Promise { ... }
> processingState = processingState.then(syncWithMigrationRelease())
Promise { ... }
> await processingState
{
documents: [
...
],
documentMap: Map(10) {
...
},
assetMap: Map(9) {
...
}
}
You should now be able to navigate to your migration release and see the imported documents with properly resolved asset and document references in their fields.
This concludes the walkthrough of the script.