Understanding the validate-content policy in API Management

In a previous blog post I already briefly touched on the validate-content policy. However, that wasn't the main topic at the time; the focus was more on the performance and capacity impact of using this specific policy.

Recently I was tasked with setting up policy fragments to apply content validation on incoming messages in API Management. The policy itself seems quite straightforward, but I ran into something unexpected that I think is worth a blog post, if only for my own recollection.

What does the validate-content policy do?

The Microsoft Learn page describes it as follows: The validate-content policy validates the size or content of a request or response body against one or more supported schemas.

Two distinct kinds of validation are mentioned here, ‘size’ and ‘content’:

  • Size validation => This involves verifying that the message size is less than a specified number of bytes

  • Content validation => This ensures that the message body (whether it's a request or response) adheres to a specified schema

In the validate-content policy you configure the rules against which message bodies and metadata are checked. Message bodies can be validated against JSON, XML or SOAP schemas. These schemas must be defined in the Schemas section of API Management before they can be used in the policy (this is checked when the policy is saved).

For all of the validation types you can specify an action to define how the policy should handle violations. There are three types of actions available:

  • ignore => The violation is ignored
  • detect => The violation is logged in Application Insights, but no further action is taken
  • prevent => The violation is logged in Application Insights, and an error is returned to the caller

The ‘prevent’ action significantly influences the message handling flow. If the message does not adhere to the specified schema, an HTTP 400 (bad request) error is returned, with the message body containing information about why the validation failed.

Schema:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "5": {
      "type": "boolean"
    },
    "field1": {
      "type": "string"
    },
    "field2": {
      "type": "string"
    },
    "field3": {
      "type": "string"
    },
    "field4": {
      "type": "string"
    }
  },
  "required": [
    "field1",
    "field2",
    "field3"
  ]
}

Request:

{
 "field1":"Hello World!"
}

Response:

{
    "statusCode": 400,
    "message": "Body of the request does not conform to the definition which is associated with the content type application/json. Required properties are missing from object: field2, field3. Line: 3, Position: 1"
}
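
To get a feel for what the ‘required’ part of such a schema enforces, here is a minimal local sketch in Python. It is not how APIM implements validation and it is not a full JSON Schema validator: it is a hand-rolled check that only understands the `required` keyword and the `string` and `boolean` types used in the schema above. The names `SCHEMA`, `JSON_TYPES` and `check_body` are my own and purely illustrative.

```python
import json

# Copy of the schema shown above, limited to the keywords this sketch understands.
SCHEMA = {
    "type": "object",
    "properties": {
        "5": {"type": "boolean"},
        "field1": {"type": "string"},
        "field2": {"type": "string"},
        "field3": {"type": "string"},
        "field4": {"type": "string"},
    },
    "required": ["field1", "field2", "field3"],
}

# Assumption: only these two JSON Schema types are mapped; others are skipped.
JSON_TYPES = {"string": str, "boolean": bool}


def check_body(body: str, schema: dict) -> list[str]:
    """Return human-readable violations; an empty list means the body passed."""
    document = json.loads(body)
    if not isinstance(document, dict):
        return ["Body is not a JSON object."]
    violations = []
    # 'required' lists property names that must be present in the object.
    missing = [name for name in schema.get("required", []) if name not in document]
    if missing:
        violations.append(
            "Required properties are missing from object: " + ", ".join(missing) + "."
        )
    # Properties that are present must match their declared type.
    for name, rules in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(rules.get("type"))
        if name in document and expected and not isinstance(document[name], expected):
            violations.append(f"Property {name} is not of type {rules['type']}.")
    return violations


print(check_body('{"field1": "Hello World!"}', SCHEMA))
```

Running a sample body through a check like this before uploading a schema is a cheap way to confirm which properties are actually marked as required.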

How to use the validate-content policy?

Using the validate-content policy in API Management might appear straightforward, but while implementing it, I discovered an under-documented key point that could be valuable to others. Let's start with how you would typically use this policy.

  1. Schema availability in APIM

Firstly, the schema against which validation will be performed must be present in APIM. This is a prerequisite for the subsequent policy setup.

Schema in APIM

  2. Adding the policy

Contrary to what the Microsoft Learn page suggests about validating ‘size or content’ separately, it's actually possible to validate both in a single policy. The example below combines three checks:

  • unspecified-content-type-action => checks for the presence of the content-type HTTP header
  • size-exceeded-action => ensures message size doesn't surpass the defined max-size
  • content => validates that the message body conforms to the schema specified in schema-id

<validate-content unspecified-content-type-action="ignore" max-size="128" size-exceeded-action="detect" errors-variable-name="requestBodyValidation">
  <content type="application/json" validate-as="json" action="prevent" schema-id="my-schema" />
</validate-content>

The actions specified in this policy example are:

  • ignore validation on a missing content-type HTTP header
  • write a log message when the message size is over 128 bytes
  • return an HTTP 400 when the message does not adhere to my-schema

I chose this configuration because the content section already specifies JSON validation, so I don't need to know whether the content-type HTTP header is missing. I do want a log entry whenever a message larger than 128 bytes is received. And since I cannot process a message that doesn't conform to the schema, I want to prevent further processing in that case.

During testing, I noticed that the validate-content policy didn't seem to activate. Trace output (enabled through the ocp-apim-trace HTTP header and subscriber tracing permissions) confirmed this. Why was this happening?

It turned out that for this JSON schema validation, APIM requires the request's content-type to be explicitly defined as application/json in the API operation definition. Because APIM could not determine the content type, it skipped this part of the policy. A crucial detail for ensuring proper execution of the validate-content policy!

API request representation in APIM
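
If your API is imported from an OpenAPI 3 definition, a small script can flag operations that might run into the same issue. The sketch below rests on an assumption of mine: that the request representation in APIM corresponds to the `requestBody.content` entry of the operation in the OpenAPI document. The `OPENAPI` dictionary is a hypothetical, heavily trimmed example and `operations_without_json_body` is an illustrative helper, not part of any SDK.

```python
# Hypothetical, heavily trimmed OpenAPI 3 document for illustration only.
OPENAPI = {
    "paths": {
        "/orders": {
            "post": {"requestBody": {"content": {"application/json": {}}}},
            "put": {"requestBody": {"content": {"text/plain": {}}}},
        },
        "/orders/{id}": {
            "patch": {},
        },
    }
}

# Assumption: only these methods are expected to carry a request body.
BODY_METHODS = ("post", "put", "patch")


def operations_without_json_body(spec: dict) -> list[str]:
    """List 'METHOD path' for operations that do not declare application/json."""
    flagged = []
    for path, item in spec.get("paths", {}).items():
        for method in BODY_METHODS:
            operation = item.get(method)
            if operation is None:
                continue
            # A missing requestBody is treated the same as a missing content type.
            content = operation.get("requestBody", {}).get("content", {})
            if "application/json" not in content:
                flagged.append(f"{method.upper()} {path}")
    return flagged


print(operations_without_json_body(OPENAPI))
# → ['PUT /orders', 'PATCH /orders/{id}']
```

Any operation that shows up in this list is worth checking in the portal before relying on JSON content validation for it.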

After I changed the API operation definition, content validation did take place, and for an invalid request the response looks like this:

{
    "statusCode": 400,
    "message": "Body of the request does not conform to the definition which is associated with the content type application/json. Required properties are missing from object: field2, field3. Line: 3, Position: 1"
}
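
For completeness, this is roughly how a test request like the one used in this post could be sent from a script, with tracing requested through the ocp-apim-trace header. Treat it as a sketch under assumptions: the gateway URL and subscription key are placeholders, the API is assumed to require a subscription key in the Ocp-Apim-Subscription-Key header, and `build_test_request` and `send` are names I made up for this example.

```python
import json
import urllib.error
import urllib.request

# Hypothetical values: replace with your own gateway URL and subscription key.
GATEWAY_URL = "https://example.azure-api.net/my-api/messages"
SUBSCRIPTION_KEY = "<your-subscription-key>"


def build_test_request(body: dict) -> urllib.request.Request:
    """Build a POST with an explicit JSON content type and tracing requested."""
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
            "Ocp-Apim-Trace": "true",
        },
    )


def send(request: urllib.request.Request) -> tuple[int, str]:
    """Send the request and return (status, body), including for 4xx/5xx replies."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # urllib raises on error statuses; the 400 body still holds the details.
        return error.code, error.read().decode("utf-8")


# Example call (needs a reachable gateway, so it is left commented out):
# status, text = send(build_test_request({"field1": "Hello World!"}))
# print(status, text)
```

Catching HTTPError matters here, because with the ‘prevent’ action the interesting part is exactly the body of the 400 response.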