NAV Navbar
Logo
shell php python csharp

Website Category API Introduction

Example JSON response for a category lookup of webshrinker.com

HTTP/1.1 200 OK
Content-Type: application/json

{
    "data": [
        {
            "categories": [
                {
                    "confident": true,
                    "id": "IAB19",
                    "label": "Technology & Computing",
                    "parent": "IAB19",
                    "score": "0.855809166500086094"
                },
                {
                    "confident": true,
                    "id": "IAB19-18",
                    "label": "Internet Technology",
                    "parent": "IAB19",
                    "score": "0.824063117153139624"
                }
            ],
            "url": "webshrinker.com"
        }
    ]
}

The Webshrinker Category API gives developers the ability to lookup the categories that a particular URL, website, domain name, or IP address is categorized as.

URLs

Querying the categories for a URL will return the categories specific to that URL, not the domain name. This can be used to analyze the content present on a specific page of a website.

Websites / Domain Names

Queries for a domain name, like example.com, will return the main categories associated with that site and its content.

IP Addresses

Queries for IP addresses will return the most relevant categories for all of the content we’ve seen hosted on that IP address. This can be used in situations where you don’t know which domain name to lookup but have an IP address.

Authentication

To make API requests you need to have an access key and secret key. If you don’t have an access key yet, you can register for a free account on our account dashboard.

There are two methods of authentication that can be used to make API requests: Basic HTTP Authentication and Pre-signed URLs.

Basic HTTP Authentication

Example HTTP request:

GET /categories/v3/d2Vic2hyaW5rZXIuY29t HTTP/1.1
Authorization: Basic Yzk1NDJkMDFkNjlmOjE3OTAyYzJjOWIzYQ==
Host: api.webshrinker.com

Basic authentication requires that you send your API access and secret key with each request to the service over HTTPS. Most programming frameworks and SDKs support sending Basic HTTP Authentication “out of the box” with a simple function call.

The access key and secret key are used as the “username” and “password” for Basic Authentication.

If you are crafting the “Authorization” header yourself it is the word “Basic” followed by the base64-encoded string “your-access-key:your-secret-key”. Additional information about this HTTP header can be found on Basic access authentication on Wikipedia.

Pre-signed URLs

WS_ACCESS_KEY="your access key"
WS_SECRET_KEY="your secret key"

URL=($(echo -n "https://www.webshrinker.com" | base64))

REQUEST="categories/v3/$URL?key=$WS_ACCESS_KEY"
HASH=($(echo -n "$WS_SECRET_KEY:$REQUEST" | (md5sum || md5)))

echo "https://api.webshrinker.com/$REQUEST&hash=$HASH"
<?php

function webshrinker_categories_v3($access_key, $secret_key, $url="", $options=array()) {
    $options['key'] = $access_key;

    $parameters = http_build_query($options);

    $request = sprintf("categories/v3/%s?%s", base64_encode($url), $parameters);
    $hash = md5(sprintf("%s:%s", $secret_key, $request));

    return "https://api.webshrinker.com/{$request}&hash={$hash}";
}

$access_key = "your access key";
$secret_key = "your secret key";

$url = "https://www.webshrinker.com/";
$signedUrl = webshrinker_categories_v3($access_key, $secret_key, $url);

echo "$signedUrl\n";

?>
try:
    from urllib import urlencode
except ImportError:
    from urllib.parse import urlencode

from base64 import urlsafe_b64encode
import hashlib

def webshrinker_categories_v3(access_key, secret_key, url=b"", params={}):
    params['key'] = access_key

    request = "categories/v3/{}?{}".format(urlsafe_b64encode(url).decode('utf-8'), urlencode(params, True))
    request_to_sign = "{}:{}".format(secret_key, request).encode('utf-8')
    signed_request = hashlib.md5(request_to_sign).hexdigest()

    return "https://api.webshrinker.com/{}&hash={}".format(request, signed_request)

access_key = "your access key"
secret_key = "your secret key"

url = b"https://www.webshrinker.com/"
print(webshrinker_categories_v3(access_key, secret_key, url))

Generating a pre-signed URL allows you to make requests without revealing your secret key. It’s perfect for situations where you need to embed a request in an application or in a webpage but don’t want users to know your secret key, preventing a third party from making unauthorized requests against your account.

The URL used to make the API request is signed using your access and secret key but the URL itself doesn’t contain the secret key. Instead it contains an extra parameter called “hash”.

The “hash” is the MD5 hash of your secret key, a colon (“:”), and the request URL with query parameters.

Category Taxonomies

You can choose to use either the “IAB Tech Lab Content Taxonomy / Quality Assurance Guidelines (QAG) Taxonomy” (iabv1) or “Webshrinker” (webshrinker). When you make a category API request, pass in the query parameter “taxonomy” with the value set to “iabv1” or “webshrinker” as needed.

The IAB Content Taxonomy contains over 400 categories and the Webshrinker taxonomy is composed of 40 top-level categories.

See Supported IAB Website Categories or Webshrinker Website Categories for additional information.

Making Requests

List All Categories

Example category list JSON response:

HTTP/1.1 200 OK
Content-Type: application/json

{
    "data": [
        {
            "categories": {
                "IAB1": {
                    "IAB1": "Arts & Entertainment",
                    "IAB1-1": "Books & Literature",
                    "IAB1-2": "Celebrity Fan/Gossip",
                    "IAB1-3": "Fine Art",
                    "IAB1-4": "Humor",
                    "IAB1-5": "Movies",
                    "IAB1-6": "Music & Audio",
                    "IAB1-7": "Television & Video"
                },
                "IAB10": {
                    "IAB10": "Home & Garden",
                    "IAB10-1": "Appliances",
                    "IAB10-2": "Entertaining",
                    "IAB10-3": "Environmental Safety",
                    "IAB10-4": "Gardening",
                    "IAB10-5": "Home Repair",
                    "IAB10-6": "Home Theater",
                    "IAB10-7": "Interior Decorating",
                    "IAB10-8": "Landscaping",
                    "IAB10-9": "Remodeling & Construction"
                },

                <!-- content shortened due to length -->

            }
        }
    ]
}
curl -u "access-key:secret-key" "https://api.webshrinker.com/categories/v3"

or

curl "https://api.webshrinker.com/categories/v3/?key=TvQu6ARhl2Zs7BVV1plU&hash=0f41d9264f05b2aeb0064fc6d7114cbc"
<?php

function webshrinker_categories_v3($access_key, $secret_key, $url="", $options=array()) {
    $options['key'] = $access_key;

    $parameters = http_build_query($options);

    $request = sprintf("categories/v3/%s?%s", base64_encode($url), $parameters);
    $hash = md5(sprintf("%s:%s", $secret_key, $request));

    return "https://api.webshrinker.com/{$request}&hash={$hash}";
}

$access_key = "your access key";
$secret_key = "your secret key";

$url = "https://www.webshrinker.com/"

$request = webshrinker_categories_v3($access_key, $secret_key, $url);

// Initialize cURL and use pre-signed URL authentication
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $request);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

$response = curl_exec($ch);
$status_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);

print_r(json_decode($response));

switch($status_code) {
    case 200:
        // Do something with the JSON response
        break;
    case 400:
        // Bad or malformed HTTP request
        break;
    case 401:
        // Unauthorized
        break;
    case 402:
        // Request limit reached
        break;
}

?>
##########################################################################################
# NOTE: If you are using Python 2.7.6 you might run into an issue
# with making API calls using the requests library.
# For a workaround, see:
# http://stackoverflow.com/questions/31649390/python-requests-ssl-handshake-failure
##########################################################################################

try:
    from urllib import urlencode
except ImportError:
    from urllib.parse import urlencode

from base64 import urlsafe_b64encode
import hashlib
import requests
import json

def webshrinker_categories_v3(access_key, secret_key, url=b"", params={}):
    params['key'] = access_key

    request = "categories/v3/{}?{}".format(urlsafe_b64encode(url).decode('utf-8'), urlencode(params, True))
    request_to_sign = "{}:{}".format(secret_key, request).encode('utf-8')
    signed_request = hashlib.md5(request_to_sign).hexdigest()

    return "https://api.webshrinker.com/{}&hash={}".format(request, signed_request)

access_key = "your access key"
secret_key = "your secret key"

api_url = webshrinker_categories_v3(access_key, secret_key)
response = requests.get(api_url)

status_code = response.status_code
data = response.json()

if status_code == 200:
    # Do something with the JSON response
    print(json.dumps(data, indent=4, sort_keys=True))
elif status_code == 202:
    # The website is being visited and the categories will be updated shortly
    print(json.dumps(data, indent=4, sort_keys=True))
elif status_code == 400:
    # Bad or malformed HTTP request
    print("Bad or malformed HTTP request")
    print(json.dumps(data, indent=4, sort_keys=True))
elif status_code == 401:
    # Unauthorized
    print("Unauthorized - check your access and secret key permissions")
    print(json.dumps(data, indent=4, sort_keys=True))
elif status_code == 402:
    # Request limit reached
    print("Account request limit reached")
    print(json.dumps(data, indent=4, sort_keys=True))
else:
    # General error occurred
    print("A general error occurred, try the request again")
using System;
using System.Text;
using System.Net;
using System.IO;

string apiKey = "your access key";
string apiSecret = "your secret key";

string apiUrl = "https://api.webshrinker.com/categories/v3";

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(apiUrl);
request.ContentType = "application/json; charset=utf-8";
request.Headers["Authorization"] = "Basic " + Convert.ToBase64String(Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey+":"+apiSecret));
request.PreAuthenticate = true;
HttpWebResponse response = request.GetResponse() as HttpWebResponse;
using (Stream responseStream = response.GetResponseStream())
{
    StreamReader reader = new StreamReader(responseStream, Encoding.UTF8);
    Console.WriteLine(reader.ReadToEnd());
}

This endpoint returns JSON with all of the possible categories that URLs, hostnames, and IP addresses can be associated with.

HTTP Request

GET https://api.webshrinker.com/categories/v3

Query Parameters

Parameter Required Default Description
key true (if using Pre-signed URLs) Your account access key to use for the request.
expires false 0 A unix timestamp of a future date when the pre-signed URL request will expire and cannot be used any more.
taxonomy false iabv1 Which category taxonomy to use, either “iabv1” or “webshrinker”
_ false Can be used as a cache buster to force a users browser to load the latest result. A good value for this would be the current unix timestamp. If using pre-signed URLs, do not include this parameter when generating the hash.

Category Lookup

Example JSON response:

HTTP/1.1 200 OK
Content-Type: application/json

{
    "data": [
        {
            "categories": [
                {
                    "confident": true,
                    "id": "IAB19",
                    "label": "Technology & Computing",
                    "parent": "IAB19",
                    "score": "0.855809166500086094"
                },
                {
                    "confident": true,
                    "id": "IAB19-18",
                    "label": "Internet Technology",
                    "parent": "IAB19",
                    "score": "0.824063117153139624"
                }
            ],
            "url": "webshrinker.com"
        }
    ]
}
curl -u "access-key:secret-key" "https://api.webshrinker.com/categories/v3/d2Vic2hyaW5rZXIuY29t"

or

curl "https://api.webshrinker.com/categories/v3/d2Vic2hyaW5rZXIuY29t?key=TvQu6ARhl2Zs7BVV1plU&hash=afe42ba2e8ae6f9d5ec2a0535ab484fe"
<?php

function webshrinker_categories_v3($access_key, $secret_key, $url="", $options=array()) {
    $options['key'] = $access_key;

    $parameters = http_build_query($options);

    $request = sprintf("categories/v3/%s?%s", base64_encode($url), $parameters);
    $hash = md5(sprintf("%s:%s", $secret_key, $request));

    return "https://api.webshrinker.com/{$request}&hash={$hash}";
}

$access_key = "your access key";
$secret_key = "your secret key";

$url = "https://www.webshrinker.com";
$request = webshrinker_categories_v3($access_key, $secret_key, $url);

// Initialize cURL and use pre-signed URL authentication
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $request);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

$response = curl_exec($ch);
$status_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);

print_r(json_decode($response));

switch($status_code) {
    case 200:
        // Do something with the JSON response
        break;
    case 202:
        // The response may have categories but the system is calculating them again,
        // check back soon
        break;
    case 400:
        // Bad or malformed HTTP request
        break;
    case 401:
        // Unauthorized
        break;
    case 402:
        // Request limit reached
        break;
}

?>
##########################################################################################
# NOTE: If you are using Python 2.7.6 you might run into an issue
# with making API calls using the requests library.
# For a workaround, see:
# http://stackoverflow.com/questions/31649390/python-requests-ssl-handshake-failure
##########################################################################################

try:
    from urllib import urlencode
except ImportError:
    from urllib.parse import urlencode

from base64 import urlsafe_b64encode
import hashlib
import requests
import json

def webshrinker_categories_v3(access_key, secret_key, url=b"", params={}):
    params['key'] = access_key

    request = "categories/v3/{}?{}".format(urlsafe_b64encode(url).decode('utf-8'), urlencode(params, True))
    request_to_sign = "{}:{}".format(secret_key, request).encode('utf-8')
    signed_request = hashlib.md5(request_to_sign).hexdigest()

    return "https://api.webshrinker.com/{}&hash={}".format(request, signed_request)

access_key = "your access key"
secret_key = "your secret key"

url = b"https://www.webshrinker.com/"

api_url = webshrinker_categories_v3(access_key, secret_key, url)
response = requests.get(api_url)

status_code = response.status_code
data = response.json()

if status_code == 200:
    # Do something with the JSON response
    print(json.dumps(data, indent=4, sort_keys=True))
elif status_code == 202:
    # The website is being visited and the categories will be updated shortly
    print(json.dumps(data, indent=4, sort_keys=True))
elif status_code == 400:
    # Bad or malformed HTTP request
    print("Bad or malformed HTTP request")
    print(json.dumps(data, indent=4, sort_keys=True))
elif status_code == 401:
    # Unauthorized
    print("Unauthorized - check your access and secret key permissions")
    print(json.dumps(data, indent=4, sort_keys=True))
elif status_code == 402:
    # Request limit reached
    print("Account request limit reached")
    print(json.dumps(data, indent=4, sort_keys=True))
else:
    # General error occurred
    print("A general error occurred, try the request again")
using System;
using System.Text;
using System.Net;
using System.IO;

string apiKey = "your access key";
string apiSecret = "your secret key";

string domain = "example.com";

string apiUrl = "https://api.webshrinker.com/categories/v3/" + Convert.ToBase64String(Encoding.GetEncoding("ISO-8859-1").GetBytes(domain));

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(apiUrl);
request.ContentType = "application/json; charset=utf-8";
request.Headers["Authorization"] = "Basic " + Convert.ToBase64String(Encoding.GetEncoding("ISO-8859-1").GetBytes(apiKey+":"+apiSecret));
request.PreAuthenticate = true;
HttpWebResponse response = request.GetResponse() as HttpWebResponse;
using (Stream responseStream = response.GetResponseStream())
{
    StreamReader reader = new StreamReader(responseStream, Encoding.UTF8);
    Console.WriteLine(reader.ReadToEnd());
}

This endpoint returns a JSON response with the categories associated with the given URL, hostname, or IP address.

If a non-HTTP 200 response is returned then an “error” attribute will be returned as part of the response. The “error” array contains a human readable message to help with debugging the error condition.

HTTP Request

GET https://api.webshrinker.com/categories/v3/<url>

URL

Parameter Required Default Description
url true The URL is part of the request and not an additional parameter. It needs to be URL safe Base64-encoded for maximum compatibility. It can be a full URL, a domain name, or an IP address.

Query Parameters

Parameter Required Default Description
key true (if using Pre-signed URLs) Your account access key to use for the request.
expires false 0 A unix timestamp of a future date when the pre-signed URL request will expire and cannot be used any more.
taxonomy false iabv1 Which category taxonomy to use, either “iabv1” or “webshrinker”
_ false Can be used as a cache buster to force a users browser to load the latest result. A good value for this would be the current unix timestamp. If using pre-signed URLs, do not include this parameter when generating the hash.

Category State

An HTTP 200 response indicates that the JSON contains the most current, up-to-date categories for the given URL.

An HTTP 202 response is given when the categories are being calculated for the given URL. If you check back again soon the categories for the URL will be updated.

Understanding the Response

IAB Taxonomy

Sample category lookup response for “webshrinker.com”

HTTP/1.1 200 OK
Content-Type: application/json

{
    "data": [
        {
            "categories": [
                {
                    "confident": true,
                    "id": "IAB19",
                    "label": "Technology & Computing",
                    "parent": "IAB19",
                    "score": "0.855809166500086094"
                },
                {
                    "confident": true,
                    "id": "IAB19-18",
                    "label": "Internet Technology",
                    "parent": "IAB19",
                    "score": "0.824063117153139624"
                }
            ],
            "url": "webshrinker.com"
        }
    ]
}

Each detected category will be listed in the “categories” section of the JSON response along with some additional details. The extra details help you understand how relevant that particular category is to the website or the content found on a page.

Field Description
confident If set to true, this indicates that the majority of the analyzed content relates to this category. Categories with the ‘confident’ flag can be useful to indicate primary from secondary categories.
id The IAB category identifier.
label Human friendly label for the detected category. This doesn’t include the label of the parent category.
parent The IAB category identifier for the entry that is one tier higher than the current category, or its parent.
score A floating point number that indicates how much confidence is given to the category selection.

Webshrinker Taxonomy

Sample category lookup response for “webshrinker.com”

HTTP/1.1 200 OK
Content-Type: application/json

{
    "data": [
        {
            "categories": [
                {
                    "id": "business",
                    "label": "Business"
                },
                {
                    "id": "informationtech",
                    "label": "Information Technology"
                }
            ],
            "url": "webshrinker.com"
        }
    ]
}

Each detected category will be listed in the “categories” section of the JSON response.

Field Description
id The Webshrinker category “short name”.
label Human friendly label for the detected category.

Errors

Example bad request response:

HTTP/1.1 401 Unauthorized
Content-Type: application/json

{
    "error": {
        "message": "Invalid signed hash or no API key/secret given"
    }
}
Status Code Meaning
200 OK – The request was successful, the most recent categories are returned
202 Accepted – Your request was successful but is still being processed on the server, check back soon
400 Bad Request – One or more parameters in the request are invalid
401 Unauthorized – Your API key/secret key is wrong or the key doesn’t have permission
402 Payment Required – Your account balance is used up, purchase additional requests in the account dashboard
412 Precondition Failed – Unable to satisfy the category request because there is not enough information available to classify the given URL
429 Too Many Requests – Too many requests in a short time
500 Internal Server Error – There was an issue processing the request, try again