Missing Data
Deprecation notice
The Missing Data API has been deprecated:
- Active Support phase ended 08 September 2022.
- Maintenance Support phase ends 08 December 2022.
- Reaches End of Life on 09 December 2022.
This feature is part of our Machine Learning APIs that are available in the Google Cloud Regions in Europe and North America.
Identifies missing data to improve quality.
The Missing Data API endpoints help to ensure data quality by identifying products which lack important data. Currently, you can query for products found to be missing the following:
- Product attributes
- Product images
- Product prices
In the future, these endpoints will provide automated recommendations on how to fill missing data.
Requests for the Missing Data API are asynchronous.
This feature is still in beta. If you have feedback or further feature requests, contact the Support Portal.
Missing Attributes
A product's ProductType defines its product attributes. However, attributes are only given values when creating a ProductVariant. Sometimes product variants have product attributes with no values or only use a subset of the available product attributes.
The Missing Attributes API identifies products and product variants that:
- Are missing attributes entirely
- Are missing attribute values
The default settings identify product variants which meet either condition.
Representations
MissingAttributes
productId
- Reference to a Product
productTypeId
- Reference to a ProductType
variantId
- Integer
ID of a ProductVariant.
missingAttributeValues
- Array of Strings
The names of the attributes found without values in a product variant, sorted by attribute importance in descending order.
missingAttributeNames
- Array of Strings - Optional
The names of the attributes of the product type that the variant is missing, sorted by attribute importance in descending order.
attributeCount
- AttributeCount - Optional
attributeCoverage
- AttributeCoverage - Optional
AttributeCount
productTypeAttributes
- Integer
Number of attributes defined in the product type.
variantAttributes
- Integer
Number of attributes defined in the variant.
missingAttributeValues
- Integer
Number of attributes missing values in the variant.
AttributeCoverage
names
- Float
Range: [0.0 - 1.0]
The percentage of attributes from the product type defined in the product variant. A value of 1.0 indicates a product variant contains all attributes defined in the product type.
values
- Float
Range: [0.0 - 1.0]
Represents the percentage of attributes in the product variant that contain values.
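The relationship between AttributeCount and AttributeCoverage can be sketched in a few lines. The formulas below are an assumption inferred from the sample response in this document (15 product type attributes, 12 variant attributes, and 2 missing values alongside a coverage of 0.80 and 0.83); the API itself does not state them, and the function name and the handling of zero counts are hypothetical.

```python
from typing import Dict


def attribute_coverage(count: Dict[str, int]) -> Dict[str, float]:
    """Estimate AttributeCoverage from an AttributeCount-shaped dict.

    Assumed formulas (inferred from the sample response, not documented):
      names  = variantAttributes / productTypeAttributes
      values = (variantAttributes - missingAttributeValues) / variantAttributes
    """
    type_attrs = count["productTypeAttributes"]
    variant_attrs = count["variantAttributes"]
    missing_values = count["missingAttributeValues"]

    # Assumption: a count of zero means nothing can be missing, so coverage is 1.0.
    names = variant_attrs / type_attrs if type_attrs else 1.0
    values = (variant_attrs - missing_values) / variant_attrs if variant_attrs else 1.0
    return {"names": names, "values": values}


sample = {"productTypeAttributes": 15, "variantAttributes": 12, "missingAttributeValues": 2}
coverage = attribute_coverage(sample)
print(round(coverage["names"], 2), round(coverage["values"], 2))
```

A check like this can be useful for sanity-testing stored results, but the values returned by the API remain the source of truth.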
MissingAttributesMeta
productLevel
- MissingAttributesProductLevel
variantLevel
- MissingAttributesVariantLevel
productTypeIds
- Array of Strings - Optional
The IDs of the product types containing the requested attributeName.
MissingAttributesProductLevel
total
- Integer
Number of products scanned.
missingAttributeNames
- Integer
Number of products missing attribute names.
missingAttributeValues
- Integer
Number of products missing attribute values.
MissingAttributesVariantLevel
total
- Integer
Number of variants scanned.
missingAttributeNames
- Integer
Number of variants missing attribute names.
missingAttributeValues
- Integer
Number of variants missing attribute values.
MissingAttributesSearchRequest
limit
- Number - Optional
offset
- Number - Optional
staged
- Boolean - Optional
Default value: false
If true, searches data from staged products in addition to published products.
productSetLimit
- Number - Optional
Default value: 100000
Range: [1 - 100000]
Maximum number of products to scan. If you need to scan more than 100000 products, contact the Support Portal.
includeVariants
- Boolean - Optional
Default value: true
If true, searches all product variants. If false, only searches master variants.
coverageMin
- Float - Optional
Default value: 0.0
Range: [0.0 - 1.0]
Minimum attribute coverage of variants to display, applied to both coverage types.
coverageMax
- Float - Optional
Default value: 1.0
Range: [0.0 - 1.0]
Maximum attribute coverage of variants to display, applied to both coverage types.
sortBy
- String - Optional
Default value: coverageAttributeValues
Allowed values: [coverageAttributeValues, coverageAttributeNames]
coverageAttributeValues shows the product variants with the most missing attribute values first, and coverageAttributeNames shows the ones with the most missing attribute names first.
showMissingAttributeNames
- Boolean - Optional
Default value: true
If true, the missingAttributeNames will be included in the results.
productIds
- Array of Strings - Optional
Filters results by the provided Product IDs. Cannot be applied in combination with any other filter.
productTypeIds
- Array of Strings - Optional
Filters results by the provided product type IDs. Cannot be applied in combination with any other filter.
attributeName
- String - Optional
Filters results by the provided attribute name. If provided, products are only checked for this attribute. Therefore, only products of product types which define the attribute name are considered. These product type IDs are then listed in MissingAttributesMeta. The attributeCount and attributeCoverage fields are not part of the response when using this filter. Cannot be applied in combination with any other filter.
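Because the three filters are mutually exclusive and several fields have fixed ranges, it can help to validate a request body on the client before sending it. The following is a minimal sketch under that assumption; the function name and the choice to raise ValueError are hypothetical, and only a subset of the fields listed above is covered.

```python
from typing import Any, Dict, List, Optional

ALLOWED_SORT_BY = ("coverageAttributeValues", "coverageAttributeNames")


def build_missing_attributes_request(
    *,
    staged: bool = False,
    product_set_limit: int = 100000,
    coverage_min: float = 0.0,
    coverage_max: float = 1.0,
    sort_by: str = "coverageAttributeValues",
    product_ids: Optional[List[str]] = None,
    product_type_ids: Optional[List[str]] = None,
    attribute_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a MissingAttributesSearchRequest body, checking the documented limits."""
    if not 1 <= product_set_limit <= 100000:
        raise ValueError("productSetLimit must be between 1 and 100000")
    for name, value in (("coverageMin", coverage_min), ("coverageMax", coverage_max)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    if sort_by not in ALLOWED_SORT_BY:
        raise ValueError(f"sortBy must be one of {ALLOWED_SORT_BY}")

    # The documentation states that each filter cannot be combined with any other.
    filters = {
        "productIds": product_ids,
        "productTypeIds": product_type_ids,
        "attributeName": attribute_name,
    }
    active = {key: value for key, value in filters.items() if value}
    if len(active) > 1:
        raise ValueError("only one of productIds, productTypeIds, attributeName may be set")

    body: Dict[str, Any] = {
        "staged": staged,
        "productSetLimit": product_set_limit,
        "coverageMin": coverage_min,
        "coverageMax": coverage_max,
        "sortBy": sort_by,
    }
    body.update(active)
    return body


print(build_missing_attributes_request(staged=True, attribute_name="color"))
```

Failing early on the client avoids starting an asynchronous task that would only report the problem later.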
TaskToken
Represents a URL path to poll to get the results of asynchronous requests.
taskId
- String
The ID for the task. Used to find the status of the task.
uriPath
- String
The URI path to poll for the status of the task.
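A polling loop over the uriPath might look like the sketch below. It is an illustration only: `fetch_status` is a hypothetical callable that performs the authenticated GET request and returns the parsed JSON body (raising an exception for error responses), and the interval and attempt limit are arbitrary. Only the state "SUCCESS" appears in this document, so any other state is treated here as "not finished yet", which is an assumption.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    fetch_status: Callable[[str], Dict[str, Any]],
    uri_path: str,
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll the status path until the task reports SUCCESS, then return its result."""
    for _ in range(max_attempts):
        status = fetch_status(uri_path)
        if status.get("state") == "SUCCESS":
            return status["result"]
        # Assumption: any other state means the task is still running.
        sleep(interval_seconds)
    raise TimeoutError(f"task at {uri_path} did not finish after {max_attempts} attempts")
```

Passing the HTTP call in as a function keeps the loop independent of any particular HTTP client and makes it easy to exercise with a stub.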
Query Missing Attributes
Initiation endpoint
Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/attributes
Method: POST
OAuth 2.0 Scopes: view_products:{projectKey}
Request Representation: MissingAttributesSearchRequest
Response Representation: TaskToken
Status endpoint
After a search completes, the status endpoint serves the results for one day. If a search completes unsuccessfully, the status endpoint returns an error response.
Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/attributes/status/{task_id}
Method: GET
OAuth 2.0 Scopes: view_products:{projectKey}
Response Representation: TaskStatus of a PagedQueryResult with results containing an array of MissingAttributes. The results array is sorted first by the selected sortBy coverage value in ascending order and then by the other coverage value. The meta has the MissingAttributesMeta representation.
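The sort order described above can be reproduced locally, for example when merging results from several tasks. This is a sketch, not the service's implementation: the function name is hypothetical, and it assumes the secondary coverage value is also sorted in ascending order, which the description does not state explicitly.

```python
from typing import Any, Dict, List


def sort_missing_attributes(
    results: List[Dict[str, Any]],
    sort_by: str = "coverageAttributeValues",
) -> List[Dict[str, Any]]:
    """Order MissingAttributes entries by the selected coverage, then by the other one."""
    if sort_by == "coverageAttributeValues":
        primary, secondary = "values", "names"
    elif sort_by == "coverageAttributeNames":
        primary, secondary = "names", "values"
    else:
        raise ValueError(f"unsupported sortBy: {sort_by}")

    # Lowest coverage first, so the variants with the most gaps lead the list.
    return sorted(
        results,
        key=lambda entry: (
            entry["attributeCoverage"][primary],
            entry["attributeCoverage"][secondary],
        ),
    )


entries = [
    {"variantId": 1, "attributeCoverage": {"names": 0.8, "values": 0.9}},
    {"variantId": 2, "attributeCoverage": {"names": 0.5, "values": 0.9}},
    {"variantId": 3, "attributeCoverage": {"names": 1.0, "values": 0.2}},
]
print([e["variantId"] for e in sort_missing_attributes(entries)])
```

Note that entries returned with the attributeName filter carry no attributeCoverage field, so this helper does not apply to them.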
curl -X POST https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/attributes \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '{"staged": true, "limit": 1}'
{
  "taskId": "b37e87f3-1d2b-4550-83ef-bd2e7e7f1b09",
  "uriPath": "/{projectKey}/missing-data/attributes/status/b37e87f3-1d2b-4550-83ef-bd2e7e7f1b09"
}
curl -H 'Authorization: Bearer {access_token}' https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/attributes/status/b37e87f3-1d2b-4550-83ef-bd2e7e7f1b09
{
  "result": {
    "count": 1,
    "offset": 0,
    "total": 26137,
    "meta": {
      "productLevel": {"total": 2968, "missingAttributeNames": 2968, "missingAttributeValues": 2968},
      "variantLevel": {"total": 26137, "missingAttributeNames": 26137, "missingAttributeValues": 26137}
    },
    "results": [
      {
        "product": {"id": "a8e01ea0-4181-4b00-9a2d-198504c6e4bd", "typeId": "product"},
        "productType": {"id": "e7878071-7713-4f37-9ddd-dfc99b9b33dc", "typeId": "product-type"},
        "variantId": 1,
        "attributeCount": {"productTypeAttributes": 15, "variantAttributes": 12, "missingAttributeValues": 2},
        "attributeCoverage": {"names": 0.80, "values": 0.83},
        "missingAttributeNames": ["designer", "color", "style"],
        "missingAttributeValues": ["completeTheLook", "lookProducts"]
      }
    ]
  },
  "state": "SUCCESS",
  "expires": "2019-01-19T15:00:56.546614Z"
}
curl -X POST https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/attributes \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '{"staged": true, "limit": 1, "attributeName": "color"}'
{
  "taskId": "37fa717f-7d17-4a27-8593-1a73ea7e4f2c",
  "uriPath": "/{projectKey}/missing-data/attributes/status/37fa717f-7d17-4a27-8593-1a73ea7e4f2c"
}
curl -H 'Authorization: Bearer {access_token}' https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/attributes/status/37fa717f-7d17-4a27-8593-1a73ea7e4f2c
{
  "result": {
    "count": 1,
    "offset": 0,
    "total": 173,
    "meta": {
      "productLevel": {"total": 1268, "missingAttributeNames": 47, "missingAttributeValues": 0},
      "variantLevel": {"total": 16137, "missingAttributeNames": 173, "missingAttributeValues": 0},
      "productTypeIds": ["e7878071-7713-4f37-9ddd-dfc99b9b33dc"]
    },
    "results": [
      {
        "product": {"id": "a8e01ea0-4181-4b00-9a2d-198504c6e4bd", "typeId": "product"},
        "productType": {"id": "e7878071-7713-4f37-9ddd-dfc99b9b33dc", "typeId": "product-type"},
        "variantId": 1,
        "missingAttributeNames": ["color"],
        "missingAttributeValues": []
      }
    ]
  },
  "state": "SUCCESS",
  "expires": "2019-02-12T15:55:26.860951Z"
}
Missing Images
This API searches for products with missing images. The default settings return product variants which do not have an image.
Additional parameters can search for products that have fewer than a specified number of images (threshold) or fewer than the median number of images per variant in the project (autoThreshold).
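The difference between the fixed threshold and the median-based autoThreshold can be illustrated with a small local calculation. This is a sketch under assumptions: the function name and the input shape (a mapping from a variant label to its image count) are hypothetical, and it assumes autoThreshold takes precedence when both settings are given.

```python
from statistics import median
from typing import Dict, List


def variants_missing_images(
    image_counts: Dict[str, int],
    threshold: int = 1,
    auto_threshold: bool = False,
) -> List[str]:
    """Return the variants whose image count falls below the effective threshold."""
    if not image_counts:
        return []
    # Assumption: with autoThreshold, the median image count replaces the fixed threshold.
    limit = median(image_counts.values()) if auto_threshold else threshold
    return sorted(label for label, count in image_counts.items() if count < limit)


counts = {"v1": 0, "v2": 2, "v3": 4, "v4": 4, "v5": 6}
print(variants_missing_images(counts))                       # default threshold of 1
print(variants_missing_images(counts, auto_threshold=True))  # median of the counts is 4
```

With the default threshold only variants without any image are flagged, while the median-based threshold also flags variants that have noticeably fewer images than is typical for the project.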
Representations
MissingImages
productId
- Reference to a Product
variantId
- Integer
ID of the ProductVariant.
imageCount
- Integer
Number of images the variant contains.
MissingImagesMeta
productLevel
- MissingImagesProductLevel
variantLevel
- MissingImagesVariantLevel
threshold
- Integer
The minimum number of images a product variant must have. Anything below this value is considered a product variant with missing images.
MissingImagesProductLevel
total
- Integer
Number of products scanned.
missingImages
- Integer
Number of products missing images.
MissingImagesVariantLevel
total
- Integer
Number of product variants scanned.
missingImages
- Integer
Number of product variants missing images.
MissingImagesSearchRequest
limit
- Number - Optional
offset
- Number - Optional
staged
- Boolean - Optional
Default value: false
If true, searches data from staged products in addition to published products.
productSetLimit
- Number - Optional
Default value: 100000
Range: [1 - 100000]
Maximum number of products to scan. If you need to scan more than 100000 products, contact the Support Portal.
includeVariants
- Boolean - Optional
Default value: true
If true, searches all product variants. If false, only searches master variants.
autoThreshold
- Boolean - Optional
Default value: false
If true, uses the median number of images per product variant as a threshold value.
threshold
- Number - Optional
Default value: 1
The minimum number of images a product variant must have. Anything below this value is considered a product variant with missing images.
productIds
- Array of Strings - Optional
Filters results by the provided Product IDs. Cannot be applied in combination with any other filter.
productTypeIds
- Array of Strings - Optional
Filters results by the provided product type IDs. Cannot be applied in combination with any other filter.
Query Missing Images
Initiation endpoint
Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/images
Method: POST
OAuth 2.0 Scopes: view_products:{projectKey}
Request Representation: MissingImagesSearchRequest
Response Representation: TaskToken
Status endpoint
After a search completes, the status endpoint serves the results for one day. If a search completes unsuccessfully, the status endpoint returns an error response.
Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/images/status/{task_id}
Method: GET
OAuth 2.0 Scopes: view_products:{projectKey}
Response Representation: TaskStatus of a PagedQueryResult with results containing an array of MissingImages and the meta information of MissingImagesMeta.
curl -X POST https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/images \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '{"staged": true, "limit": 1}'
{
  "taskId": "3508ab41-59cf-4130-be4f-cfe79c78d436",
  "uriPath": "/{projectKey}/missing-data/images/status/3508ab41-59cf-4130-be4f-cfe79c78d436"
}
curl -H 'Authorization: Bearer {access_token}' https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/images/status/3508ab41-59cf-4130-be4f-cfe79c78d436
{
  "result": {
    "count": 1,
    "offset": 0,
    "total": 7,
    "meta": {
      "threshold": 1,
      "productLevel": {"total": 2968, "missingImages": 1},
      "variantLevel": {"total": 29105, "missingImages": 7}
    },
    "results": [
      {
        "product": {"id": "a2a9db01-00fe-436d-b56e-2545642984c0", "typeId": "product"},
        "variantId": 2,
        "imageCount": 0
      }
    ]
  },
  "state": "SUCCESS",
  "expires": "2019-01-19T15:09:27.791377Z"
}
Missing Prices
This API identifies products with missing prices. The default settings return product variants that do not contain prices or have some empty prices.
Additional parameters can be used to treat prices set to 0 as missing (zeroAsEmpty) and to check whether there are variants with no valid prices for a specified date range (validFrom, validUntil).
Representations
MissingPrices
productId
- Reference to a Product
variantId
- Integer
ID of the ProductVariant.
MissingPricesMeta
productLevel
- MissingPricesProductLevel
variantLevel
- MissingPricesVariantLevel
MissingPricesProductLevel
total
- Integer
Number of products scanned.
missingPrices
- Integer
Number of products missing prices.
MissingPricesVariantLevel
total
- Integer
Number of product variants scanned.
missingPrices
- Integer
Number of product variants missing prices.
MissingPricesSearchRequest
limit
- Number - Optional
offset
- Number - Optional
staged
- Boolean - Optional
Default value: false
If true, searches data from staged products in addition to published products.
productSetLimit
- Number - Optional
Default value: 100000
Range: [1 - 100000]
Maximum number of products to scan. If you need to scan more than 100000 products, contact the Support Portal.
includeVariants
- Boolean - Optional
Default value: true
If true, searches all product variants. If false, only searches master variants.
currencyCode
- String - Optional
If used, only checks if a product variant has a price in the provided ISO 4217 currency code.
checkDate
- Boolean - Optional
Default value: false
If true, checks if there are prices for the specified date range and time.
validFrom
- DateTime - Optional
Starting date of the range to check. If no value is given, checks prices valid at the time the search is initiated.
validUntil
- DateTime - Optional
Ending date of the range to check. If no value is given, it is equal to validFrom.
productIds
- Array of Strings - Optional
Filters results by the provided Product IDs. Cannot be applied in combination with the productTypeIds filter.
productTypeIds
- Array of Strings - Optional
Filters results by the provided product type IDs. Cannot be applied in combination with the productIds filter.
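A request body for a date-range price check could be assembled as in the sketch below. Several points are assumptions rather than documented behavior: the function names are hypothetical, DateTime values are assumed to be ISO 8601 strings in UTC with millisecond precision, and checkDate is set to true whenever a date is supplied on the assumption that the date fields are only evaluated in that case.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def to_iso_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC string (assumed wire format)."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def build_missing_prices_request(
    currency_code: Optional[str] = None,
    valid_from: Optional[datetime] = None,
    valid_until: Optional[datetime] = None,
    product_ids: Optional[List[str]] = None,
    product_type_ids: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a MissingPricesSearchRequest body with an optional date range."""
    # The documentation states that these two filters cannot be combined.
    if product_ids and product_type_ids:
        raise ValueError("productIds and productTypeIds cannot be combined")

    body: Dict[str, Any] = {}
    if currency_code:
        body["currencyCode"] = currency_code
    if valid_from or valid_until:
        body["checkDate"] = True  # assumption: required for the dates to take effect
        if valid_from:
            body["validFrom"] = to_iso_utc(valid_from)
        if valid_until:
            body["validUntil"] = to_iso_utc(valid_until)
    if product_ids:
        body["productIds"] = product_ids
    if product_type_ids:
        body["productTypeIds"] = product_type_ids
    return body


print(build_missing_prices_request(
    currency_code="EUR",
    valid_from=datetime(2022, 1, 1, tzinfo=timezone.utc),
    valid_until=datetime(2022, 1, 31, 12, 30, tzinfo=timezone.utc),
))
```

Requiring timezone-aware datetimes avoids silently interpreting naive values in the local time zone of the machine that builds the request.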
Query Missing Prices
Initiation endpoint
Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/prices
Method: POST
OAuth 2.0 Scopes: view_products:{projectKey}
Request Representation: MissingPricesSearchRequest
Response Representation: TaskToken
Status endpoint
After a search completes, the status endpoint serves the results for one day. If a search completes unsuccessfully, the status endpoint returns an error response.
Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/prices/status/{task_id}
Method: GET
OAuth 2.0 Scopes: view_products:{projectKey}
Response Representation: TaskStatus of a PagedQueryResult with results containing an array of MissingPrices and the meta information of MissingPricesMeta.
curl -X POST https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/prices \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '{"staged": true, "limit": 2}'
{
  "taskId": "34a771e8-cd39-44c4-8989-181e6b588a7f",
  "uriPath": "/{projectKey}/missing-data/prices/status/34a771e8-cd39-44c4-8989-181e6b588a7f"
}
curl -H 'Authorization: Bearer {access_token}' https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/prices/status/34a771e8-cd39-44c4-8989-181e6b588a7f
{
  "result": {
    "count": 2,
    "offset": 0,
    "total": 122,
    "meta": {
      "productLevel": {"total": 2828, "missingPrices": 122},
      "variantLevel": {"total": 2828, "missingPrices": 122}
    },
    "results": [
      {
        "product": {"id": "d637f362-0940-4902-9e8e-a361ff71d569", "typeId": "product"},
        "variantId": 1
      },
      {
        "product": {"id": "9a47c2be-c842-4403-a779-64f8872d587d", "typeId": "product"},
        "variantId": 1
      }
    ]
  },
  "state": "SUCCESS",
  "expires": "2019-01-19T16:18:45.121950Z"
}