Uploading Documents

Upload documents to your new custom data source

Listed below are the required and optional fields needed to upload documents to your new custom data source. A document in this case is a Mention and its metadata, such as the Mention's text, author or date.

🚧

"items" array

All of the following required/optional fields for each document are placed within an array called items as seen in the example calls below.

Required Fields

  • contents: Main body text of the mention. Max length is 16k characters.
  • date: Date associated with the document. Must be a valid ISO-style date string in the format yyyy-MM-dd['T'HH:mm:ss].
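Before uploading, it can be useful to check these rules client-side. The following sketch is a hypothetical helper (not part of the API) that mirrors the two required-field constraints above:

```python
from datetime import datetime

MAX_CONTENTS = 16_000  # 16k character limit on "contents"

def validate_required(doc: dict) -> list[str]:
    """Return a list of validation errors for the required fields of one document."""
    errors = []
    contents = doc.get("contents")
    if not contents:
        errors.append("contents is required")
    elif len(contents) > MAX_CONTENTS:
        errors.append("contents exceeds 16k characters")

    date = doc.get("date")
    if not date:
        errors.append("date is required")
    else:
        # Accepts yyyy-MM-dd with an optional 'T'HH:mm:ss time part
        for fmt in ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%S"):
            try:
                datetime.strptime(date, fmt)
                break
            except ValueError:
                continue
        else:
            errors.append("date is not a valid ISO-style date string")
    return errors

print(validate_required({"contents": "hello", "date": "2020-03-25T15:04:00"}))  # []
```

Validating locally avoids burning part of your upload quota on requests the API would reject anyway.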

Optional Fields

  • url: Optional if guid is supplied, otherwise required. URL associated with the document (should be unique within the data source). Must be a valid URL.
  • guid: Optional if url is supplied, otherwise required. User-supplied unique identifier for the document (should be unique within the data source). Max length of 1k characters. If a guid is re-used, the original document will be replaced. If the date differs but the guid is the same, real-time matching behaviour is undefined (documents may end up appearing multiple times in a dashboard); backfilling should fix this.
  • author: Max length of 200 characters.
  • title: Max length of 200 characters.
  • gender: m, f, male or female.
  • language: Must be a valid language code. If this value is not set, we will attempt to identify the language automatically from the contents field.
  • geolocation: One of: id, a valid BCR geolocation ID (please contact [email protected] for a list of location IDs); latitude/longitude, in degrees; or zipcode, a valid US zipcode (5-digit number as a string).
  • parentGuid: Max length of 1,000 characters. Can be queried via the engagingWithGuid: operator.
  • engagementType: Values can be comment, reply or retweet. Can be queried via the engagementType: operator.
  • pageId: Max length of 1,000 characters.
  • authorProfileId: Max length of 1,000 characters.
  • batch: The identifier for the set of documents being uploaded; it can be specified when uploading new custom documents. If not specified, it will be assigned automatically.
  • categories: An array of user-defined categories to upload with the document. See below for details of the format.

Example Call

The following call uploads two documents. The DATA_SOURCE_NUMERIC_ID value comes from the id field we noted when creating a custom data source and is entered in the contentSource field when making this request. You may also upload individual documents to a specific batch (group) of documents by adding the batch id:

curl -X POST 'https://api.brandwatch.com/content/upload' \
-H 'authorization: bearer xxxxxx-xxxxxx-xxxxxx-xxxxxx-xxxx' \
-H 'Content-Type: application/json' \
-d '
{
  "items": [
    {
      "guid": "3d101fd9b2004a11a76ba1ea637eb9f2",
      "geolocation": {
        "id": "USA.fl"
      },
      "date": "2020-03-25T15:04:00",
      "contents": "testing the data upload API",
      "custom": {
        "myfield": "testmetric"
      }
    },
    {
      "guid": "3d101fd9b2004a11a76ba1ea637eb9f3",
      "geolocation": {
        "id": "USA.pa"
      },
      "date": "2020-03-25T15:05:00",
      "contents": "testing the data upload API..again",
      "custom": {
        "npsCategory": "Promoter"
      }
    }
  ],
  "contentSource": 34354220140,
  "batch": "yourBatchIdHere-12345"
}'

Here is what the response looks like:

{
    "uploadCount":2,
    "Batch":"yourBatchIdHere-12345"
}

❗️

Limits & Usage Reporting

You are limited to uploading 1,000 documents per request. The uploadCount value in the JSON response represents the number of documents you've just uploaded.
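Because each request accepts at most 1,000 documents, a larger upload needs to be split across several requests. A minimal sketch of client-side batching (the chunking helper is our own, not part of the API):

```python
def chunk_items(items, size=1000):
    """Yield successive slices of at most `size` documents per request."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Build one request payload per chunk of up to 1,000 documents.
docs = [{"guid": str(i), "contents": "...", "date": "2020-03-25"} for i in range(2500)]
payloads = [{"items": batch, "contentSource": 34354220140} for batch in chunk_items(docs)]
print([len(p["items"]) for p in payloads])  # [1000, 1000, 500]
```

Each payload would then be POSTed to /content/upload exactly as in the curl example above.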

There is also a limit to how many documents can be uploaded in a 30 day/24 hour period. There are two options for monitoring usage, which you can find in the article Usage Reporting.

Custom Fields

Each document can have a set of custom fields associated with it. These custom fields are arbitrary key-value pairs that you can use to upload your own categorization for the uploaded custom data. They are mostly used for filtering documents (at the query and/or dashboard level), but can also be used in rules and tags. For example, if you upload product review data, you can use a custom field to upload the product rating for each review. You can use alphanumeric characters and _ in the name(s) of your custom field(s).

{
  "items": [
    {
      "guid": "...",
      "custom": {
        "myField1": "this is some text",
        "anotherField1": "with some different text"
      }
    },
    {
      "guid": "...",
      "custom": {
        "myField2": "this is some text again",
        "anotherField2": "with some different text again"
      }
    }
  ],
  "contentSource": DATA_SOURCE_NUMERIC_ID
}

There is a 100 character limit for the names of custom fields, and a 10,000 character limit for the contents.

Text uploaded in custom fields is tokenized, but not in the same way as regular contents: the text is simply lowercased and split on whitespace. This means that punctuation characters (and any other special characters) will be retained and can be searched for.
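A rough sketch of that tokenization (our own approximation of the described behaviour, not the actual implementation):

```python
def tokenize_custom_field(value: str) -> list[str]:
    # Custom-field text is simply lowercased and split on whitespace;
    # punctuation and special characters survive inside the tokens.
    return value.lower().split()

print(tokenize_custom_field("NPS: Promoter (2020)"))  # ['nps:', 'promoter', '(2020)']
```

Note that "nps:" (with the colon) is its own token here, which is why searches against custom fields behave differently from searches against regular contents.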

To filter your data using custom fields, you can use the custom_CustomFieldName: operator. Replace CustomFieldName with the name of your custom field and add the value of your custom field after the operator as seen in the example below:

custom_NPSCategory:Promoter

This would match documents that have a custom field "NPSCategory" with a value of "Promoter", "Promoter Something", etc.

Numeric Custom Fields

If the data you upload into a custom field looks like a number (e.g. 123 or 10.5), then you can treat it as a numeric custom field. This unlocks a few things:

  • Sorting mentions by numeric values
  • Sum and average of field in charts
  • Filtering by numeric ranges

The first two should be available in the BCR UI itself.

Filtering requires the customNumeric_CustomFieldName special operator (which is different from custom_CustomFieldName). Again, replace CustomFieldName with your field name (case matters). Then you can use a "range" syntax to specify the range of values you want to match:

  • customNumeric_NPS:[* TO 3] - filter to find docs with NPS less than or equal to 3
  • customNumeric_NPS:[2 TO *] - filter to find docs with NPS greater than or equal to 2
  • customNumeric_NPS:[2 TO 3] - filter to find docs with NPS between 2 and 3

NB. the numbers are always assumed to be decimals, so depending on your data you might need to allow for that when filtering.
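If you build these filters programmatically, a small helper keeps the range syntax consistent (a hypothetical convenience function, not part of any Brandwatch library):

```python
def numeric_range_filter(field: str, low=None, high=None) -> str:
    """Build a customNumeric_ range filter; None means an open bound (*)."""
    lo = "*" if low is None else low
    hi = "*" if high is None else high
    return f"customNumeric_{field}:[{lo} TO {hi}]"

print(numeric_range_filter("NPS", high=3))         # customNumeric_NPS:[* TO 3]
print(numeric_range_filter("NPS", low=2))          # customNumeric_NPS:[2 TO *]
print(numeric_range_filter("NPS", low=2, high=3))  # customNumeric_NPS:[2 TO 3]
```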

❗️

Custom Fields Limit

There is a limit of 10 custom fields per data source. As documents are uploaded, the custom fields used are tracked. If the existing field names used and the new field names in an upload would exceed 10, then you will receive an error (HTTP 400) when trying to upload the documents.
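A client-side guard against that limit might look like the following sketch (a hypothetical helper; the server performs the authoritative check and returns HTTP 400):

```python
MAX_CUSTOM_FIELDS = 10  # per-data-source limit on distinct custom field names

def check_custom_field_limit(existing: set[str], items: list[dict]) -> set[str]:
    """Return the combined field-name set, or raise if the upload would exceed the limit."""
    combined = set(existing)
    for item in items:
        combined.update(item.get("custom", {}))
    if len(combined) > MAX_CUSTOM_FIELDS:
        raise ValueError(
            f"{len(combined)} custom fields would exceed the limit of {MAX_CUSTOM_FIELDS}"
        )
    return combined

fields = check_custom_field_limit({"nps", "rating"}, [{"custom": {"nps": "9", "region": "EMEA"}}])
print(sorted(fields))  # ['nps', 'rating', 'region']
```

Here `existing` would be the set of field names already used by the data source, which you would track on your side between uploads.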

Preassigned Categories

It can be useful to upload documents with some categories pre-assigned. This can work well in conjunction with numeric custom fields, as it unlocks the ability to segment your own data in an arbitrary manner. e.g. you could upload review data with a category for the product type and a numeric custom field for the rating. You would then be able to show custom charts of average review scores broken down by product type.

You will need to have created your categories beforehand, using either BCR itself or the API to manage categories.

To upload a document with categories pre-assigned you just need to add a categories array to the document. Each element in the array will be an object with two fields:

  • id: The ID of the category (can be discovered via the API).
  • projectId: The ID of the project the category is part of.

An example call to upload a document with a category assigned:

curl -X POST 'https://api.brandwatch.com/content/upload' \
-H 'authorization: bearer xxxxxx-xxxxxx-xxxxxx-xxxxxx-xxxx' \
-H 'Content-Type: application/json' \
-d '
{
  "items": [
    {
      "guid": "3d101fd9b2004a11a76ba1ea637eb9f2",
      "date": "2020-03-25T15:04:00",
      "contents": "testing the data upload API",
      "categories": [
          {"id": 1234, "projectId": 7890}
      ]
    }
  ],
  "contentSource": 34354220140
}'

The category and project IDs will be validated to ensure they exist and that you have access to them. There are also other validation steps to ensure you are not assigning multiple categories that would conflict and so on.

🚧

Categories Limit

There is a limit of 10 categories per uploaded document. Each document can have its own set of up to 10 categories, though.

❗️

Categories Are Project-Scoped

If you view your uploaded data in two different projects, you will not see the categories in both of those projects. You will only see the categories that relate to the specific project your query belongs to.

❗️

Interaction With Rules And Manually Assigned Categories

There are some non-obvious things to bear in mind when you upload documents with categories.
When a document is uploaded with categories, these are only the initial set of categories. Rules and manual category assignment can and will change those categories further.

If you want to opt out your content source from a rule you can use NOT pubType:<CONTENT-SOURCE-NAME> in your rule.

If you re-upload a document that a user had previously assigned a category to, the user's category will be re-applied (potentially replacing the category initially uploaded).


What’s Next

Once you've uploaded all of your documents to your new custom data source, you must create a query to retrieve this data.