
Origin Strategies

When registering an asset, you supply an origin from which the platform can fetch the asset for processing.

If this origin is an open-to-all HTTP URL, the platform can fetch it without any further information. This is not practical in many scenarios, where the origin might be at:

  • an access-controlled repository
  • an AWS S3 bucket
  • an SFTP server
  • a web server protected with HTTP Basic Authentication

For some delivery channels, the platform only needs to see the asset at registration time (for example, a video that the platform will transcode to other formats, or an image set without the use-original policy). The platform only fetches the origin asset URI again if asked to re-process the asset; it doesn’t need access to it in order to serve requests from end users. However, other delivery channels require the platform to have high-speed access to the origin at any time. Most origins cannot reliably provide this, so the platform makes a copy of the original asset and stores it locally, which counts towards the customer’s storage usage.

In some circumstances you want the platform to assume that it can always get the asset from the origin - quickly, and at any time. This is known as an optimised origin. A common scenario is where the origin is a URI in a repository that the platform has credentials to access - it can treat the origin as part of its own storage, and avoid duplicating the asset locally.

Two types of resource, OriginStrategy and CustomerOriginStrategy, are used to configure how the platform requests origin URIs for a customer.

The platform provides some out-of-the-box strategies, identified by URI. These are globally available and listed at (EntryPoint)/originStrategies. You can’t edit these.

{
  "@context": "https://dlcs.github.io/vocab/context/future.json",
  "@id": "https://api.dlcs.example/originStrategies",
  "@type": "Collection",
  "totalItems": 3,
  "pageSize": 100,
  "member": [
    {
      "@id": "https://api.dlcs.example/originStrategies/basic-http-authentication",
      "@type": "vocab:OriginStrategy",
      "requiresCredentials": true
    },
    {
      "@id": "https://api.dlcs.example/originStrategies/s3-ambient",
      "@type": "vocab:OriginStrategy",
      "requiresCredentials": false
    },
    {
      "@id": "https://api.dlcs.example/originStrategies/sftp",
      "@type": "vocab:OriginStrategy",
      "requiresCredentials": true
    }
  ]
}

A single strategy, retrieved from its own URI:

{
  "@context": "https://dlcs.github.io/vocab/context/future.json",
  "@id": "https://api.dlcs.example/originStrategies/basic-http-authentication",
  "@type": "vocab:OriginStrategy",
  "requiresCredentials": true
}

/originStrategies/{originStrategy}

| Method | Label | Expects | Returns | Status |
|---|---|---|---|---|
| GET | Retrieve an origin strategy | - | vocab:OriginStrategy | 200 OK, 404 Not Found |

requiresCredentials

Whether the platform needs stored credentials to fetch images with this strategy.

| domain | range | readonly | writeonly |
|---|---|---|---|
| vocab:OriginStrategy | xsd:boolean | True | False |
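As an illustration of how a client might use requiresCredentials, the following sketch parses a collection response and lists the strategies that need stored credentials. The sample JSON is abridged from the listing above; the function name `strategies_requiring_credentials` is hypothetical and not part of the API.

```python
import json

# Abridged copy of the collection listing shown above (only the fields used here).
SAMPLE = """
{
  "@type": "Collection",
  "member": [
    {"@id": "https://api.dlcs.example/originStrategies/basic-http-authentication",
     "requiresCredentials": true},
    {"@id": "https://api.dlcs.example/originStrategies/s3-ambient",
     "requiresCredentials": false},
    {"@id": "https://api.dlcs.example/originStrategies/sftp",
     "requiresCredentials": true}
  ]
}
"""


def strategies_requiring_credentials(collection: dict) -> list[str]:
    """Return the @id of every member whose requiresCredentials is true."""
    return [
        member["@id"]
        for member in collection.get("member", [])
        if member.get("requiresCredentials")
    ]


needing_credentials = strategies_requiring_credentials(json.loads(SAMPLE))
```

A client could run a check like this before creating a customer origin strategy, to decide whether it has to collect credentials from the user first.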

When configuring how the platform obtains your assets, you reference one of these in your own CustomerOriginStrategy as described next.

The platform has the predefined set of mechanisms listed above for obtaining resources over HTTP, SFTP, S3, etc. In your customer origin strategies you combine a predefined strategy with a regular expression that matches your origin URLs and, if required, credentials for the platform to use when requesting your assets. For basic authentication, SFTP and other simple access control mechanisms the platform needs to store your credentials securely, but it never reflects them back to you via the API: you can update the stored credentials, but you can never read them back.

You use regular expressions to match origins instead of having to define an origin strategy per-asset, which would be cumbersome.

{
  "@context": "https://dlcs.github.io/vocab/context/future.json",
  "@id": "https://api.dlcs.example/customers/2/originStrategies/48702c3d-0529-4b52-9433-7f7f04e91e33",
  "@type": "vocab:CustomerOriginStrategy",
  "regex": "*",
  "strategy": "https://api.dlcs.example/originStrategies/basic-http-authentication",
  "credentials": "xxx",
  "optimised": false,
  "order": 1
}

/customers/{customer}/originStrategies/{id}

| Method | Label | Expects | Returns | Status |
|---|---|---|---|---|
| GET | Retrieve a Customer Origin Strategy | - | vocab:CustomerOriginStrategy | 200 OK, 404 Not Found |
| PUT | Update a Customer Origin Strategy (but not create one) | vocab:CustomerOriginStrategy | vocab:CustomerOriginStrategy | 200 OK, 400 Bad Request, 404 Not Found |
| DELETE | Delete a Customer Origin Strategy | - | owl:Nothing | 204 No Content |

You create a Customer Origin Strategy with a POST to the parent collection /customers/{customer}/originStrategies. You cannot create one with PUT; the platform always assigns a unique GUID path element.

With a PUT, you can supply any or all of the fields regex, strategy, optimised, and order. For strategies that require credentials, credentials must also be present in the PUT body for validation, but the stored credentials are not updated by a PUT to this resource — use the credentials sub-resource for that.

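strategy

| domain | range | readonly | writeonly |
|---|---|---|---|
| vocab:CustomerOriginStrategy | vocab:OriginStrategy | False | False |

The URI of the strategy, which may be expressed in compact or expanded form. For example, the compact form of https://api.dlcs.example/originStrategies/basic-http-authentication is basic-http-authentication.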
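The relationship between the two forms can be sketched as a pair of helpers. This is only an illustration: the base URI is taken from the examples on this page, and the helper names `expand_strategy` and `compact_strategy` are hypothetical. How the platform itself resolves compact names is not specified here.

```python
# Assumed base, taken from the example URIs on this page.
STRATEGY_BASE = "https://api.dlcs.example/originStrategies/"


def expand_strategy(value: str, base: str = STRATEGY_BASE) -> str:
    """Return the expanded URI; values that already look like URIs pass through."""
    if "://" in value:
        return value
    return base + value


def compact_strategy(value: str) -> str:
    """Return the last path segment of a strategy URI (or the value itself)."""
    return value.rstrip("/").rsplit("/", 1)[-1]
```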

regex

| domain | range | readonly | writeonly |
|---|---|---|---|
| vocab:CustomerOriginStrategy | xsd:string | False | False |

A regular expression for matching origins. When the platform works out how to fetch from your origin, it tests the origin against this regex to find the correct strategy. Some examples:

  • "regex": "*"
  • "regex": "sftp\:\/\/london\.digirati\.com\:22796\/.*"
  • "regex": "https\:\/\/s3\-eu\-west\-1\.amazonaws\.com\/bucketname\/.*(?<!\.tif|\.tiff)$"

The value of regex must be unique within a customer: you can’t have multiple CustomerOriginStrategy resources with the same regex, and an attempt to create or update one with a duplicate regex results in HTTP 409 Conflict.

order

| domain | range | readonly | writeonly |
|---|---|---|---|
| vocab:CustomerOriginStrategy | xsd:integer | False | False |

Specifies the order in which the regexes are tested against a candidate origin. Once the platform has a match, it doesn’t test any further CustomerOriginStrategy resources. This allows you to have a catch-all policy with a higher order. The ultimate catch-all is no policy: if no regexes match, the platform just tries to fetch from the origin without any special behaviour.

The value of order does not need to be unique within a customer.
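The matching behaviour described above can be sketched as follows. This is a minimal illustration, not the platform's implementation. It assumes that strategies are tested in ascending order, that ties keep their input order, and that a regex matches if it is found anywhere in the origin (`re.search`). The class and function names are hypothetical, and the patterns use Python regex syntax.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CustomerOriginStrategy:
    regex: str
    strategy: str
    order: int


def select_strategy(
    origin: str, strategies: list[CustomerOriginStrategy]
) -> Optional[CustomerOriginStrategy]:
    """Return the first strategy, by ascending order, whose regex matches."""
    # sorted() is stable, so strategies with equal order keep their input order.
    for candidate in sorted(strategies, key=lambda s: s.order):
        if re.search(candidate.regex, origin):
            return candidate
    # No match: the caller falls back to a plain fetch with no special behaviour.
    return None


strategies = [
    CustomerOriginStrategy(r".*", "basic-http-authentication", 100),  # catch-all
    CustomerOriginStrategy(r"sftp://london\.digirati\.com:22796/.*", "sftp", 1),
    CustomerOriginStrategy(
        r"https://s3-eu-west-1\.amazonaws\.com/bucketname/.*", "s3-ambient", 2
    ),
]

chosen = select_strategy("sftp://london.digirati.com:22796/images/a.jp2", strategies)
fallback = select_strategy("https://example.org/a.jpg", strategies)
```

Here the catch-all has the highest order, so it is only reached when neither of the more specific patterns matches.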

optimised

| domain | range | readonly | writeonly |
|---|---|---|---|
| vocab:CustomerOriginStrategy | xsd:boolean | False | False |

Whether the platform can reliably fetch from this location on demand.

This is currently only valid when the strategy is s3-ambient.
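A client could check this rule before sending a request. The sketch below is a hypothetical client-side check, not something the API provides; the function name `check_optimised` is an assumption, and the server remains the authority on what it accepts.

```python
def check_optimised(body: dict) -> list[str]:
    """Return a list of problems with the optimised flag (empty if none)."""
    problems = []
    # Accept either the compact or the expanded form of the strategy URI.
    name = str(body.get("strategy", "")).rstrip("/").rsplit("/", 1)[-1]
    if body.get("optimised") and name != "s3-ambient":
        problems.append("optimised is only valid when the strategy is s3-ambient")
    return problems


ok = check_optimised({"strategy": "s3-ambient", "optimised": True})
bad = check_optimised(
    {
        "strategy": "https://api.dlcs.example/originStrategies/sftp",
        "optimised": True,
    }
)
```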

credentials

| domain | range | readonly | writeonly |
|---|---|---|---|
| vocab:CustomerOriginStrategy | xsd:string* | False | True |

The credentials property must be supplied in a POST to the parent collection, and must also be present in a PUT for strategies that require credentials. It is never rendered back via the API; it always appears as xxx.

*When supplied, its value is a string of escaped JSON, rather than a JSON object. This is to allow arbitrary JSON serialisations of credentials without requiring a specific type here. For credentials that take the form of a user name and a password, the supplied value should be an escaped JSON object with the fields user and password, like this:

"credentials": "{ \"user\": \"uuu\", \"password\": \"ppp\" }"

and NOT like this:

"credentials": { "user": "uuu", "password": "ppp" }

This allows for other forms of credential in future.
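One way to produce the escaped form is to serialise the credentials object to a string first, and then serialise the whole request body. The sketch below assumes user name and password credentials; the helper name `encode_credentials` and the origin pattern are hypothetical, and the body is only built, not sent.

```python
import json


def encode_credentials(user: str, password: str) -> str:
    """Serialise credentials to a JSON string, for use as a string value."""
    return json.dumps({"user": user, "password": password})


body = {
    "regex": r"https://assets\.example\.org/.*",  # hypothetical origin pattern
    "strategy": "https://api.dlcs.example/originStrategies/basic-http-authentication",
    "credentials": encode_credentials("uuu", "ppp"),  # a string, not an object
    "optimised": False,
    "order": 1,
}

# Serialising the body escapes the inner quotes, giving the form shown above.
request_text = json.dumps(body, indent=2)
```

Because credentials is already a string when the body is serialised, the inner quotes are escaped automatically and no manual escaping is needed.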