Website Screenshots Made Easy via API
Simplify website screenshots: A guide using the Adcolabs Scraper
Screenshots of websites can be created in various ways. The fastest and easiest method often involves using the built-in tools of your operating system. Alternatively, many browsers offer corresponding tools. But what if you need to take screenshots regularly? And perhaps not only of the visible area but also of entire pages—and in larger quantities?
For such complex requirements, the Adcolabs Scraper is an excellent choice. In this article, we use a short example to show how easy it is to use.
Getting Started
All you need to get started is an account and an Adcolabs Scraper API key. You can find a step-by-step guide in our blog post on website extraction.
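Since every request carries the API key as a header, it can be convenient to keep the key in a shell variable instead of pasting it into each command. A small sketch; the variable name ADCOLABS_API_KEY is our own convention, not something the API requires:

```shell
# Keep the key available for the current shell session
# (the name ADCOLABS_API_KEY is just a convention we chose)
export ADCOLABS_API_KEY="<API-KEY>"

# Reference it in requests instead of repeating the raw key
echo "API-KEY: ${ADCOLABS_API_KEY}"
```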
Creating Screenshots
Here is an example of a curl command to take a simple screenshot:
curl -X POST https://api.scraper.adcolabs.de/v1/extractions \
-H "Content-Type: application/json" \
-H "API-KEY: <API-KEY>" \
-d '{"url":"https://adcolabs.de","agent":{"resolution":{"width":"1903","height":"927"},"options":{"headers":[]}},"connectivity":{"proxy":"europe"},"workflow":[{"action":"screenshot"}],"selector":[],"webhook":{"enabled":false,"headers":[]}}'
This command initiates what is called an extraction. The output of the command provides an ID, which we will need for the next steps:
{"id":"67683c348c067b2886aa1d43","url":"https://adcolabs.de","status":"CREATED","created":"2024-12-22T16:20:04.812794718","webhook":{"enabled":false,"headers":[],"successful":false,"retry":0},"agent":{"resolution":{"width":1903,"height":927},"options":{"headers":[]}},"workflow":[{"action":"screenshot","value":0}],"extractions":[],"connectivity":{"proxy":"europe"}}
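When scripting, the ID can be captured directly with jq instead of copied by hand. A minimal sketch using a shortened sample of the response above:

```shell
# Shortened sample of the POST response; in practice this comes from the curl call
RESPONSE='{"id":"67683c348c067b2886aa1d43","url":"https://adcolabs.de","status":"CREATED"}'

# jq -r prints the raw string value without surrounding quotes
EXTRACTION_ID=$(echo "$RESPONSE" | jq -r .id)
echo "$EXTRACTION_ID"
```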
You can use the ID to retrieve the result:
curl -X GET https://api.scraper.adcolabs.de/v1/extractions/67683c348c067b2886aa1d43 \
-H "Content-Type: application/json" \
-H "API-KEY: <API-KEY>"
Since processing takes a moment, the status will initially show as WAITING:
{"id":"67683c348c067b2886aa1d43","url":"https://adcolabs.de","timeStamp":"2024-12-22T16:20:04.812","agent":{"resolution":{"width":1903,"height":927},"options":{"headers":[]}},"selector":[],"status":"WAITING","workflow":[{"action":"screenshot","value":0}],"webhook":{"enabled":false,"headers":[],"successful":false,"retry":0},"connectivity":{"proxy":"europe"}}
After a few seconds, the status changes to DONE. Additionally, a list of artifact URLs containing the results becomes available:
{"id":"67683c348c067b2886aa1d43","url":"https://adcolabs.de","timeStamp":"2024-12-22T16:20:04.812","agent":{"resolution":{"width":1903,"height":927},"options":{"headers":[]}},"selector":[],"status":"DONE","workflow":[{"action":"screenshot","value":0}],"webhook":{"enabled":false,"headers":[],"successful":false,"retry":0},"artifacts":["https://artifacts.s3c.adcolabs.de/2024-12-22/67683c348c067b2886aa1d43/49d78836a161836a/67683c348c067b2886aa1d43_1734884442.png"],"connectivity":{"proxy":"europe"},"extractionsResponses":[],"outputAgents":{"statusCode":0,"headers":{"accept-ranges":"bytes","content-length":"21067","content-type":"text/html","date":"Sun, 22 Dec 2024 16:20:38 GMT","etag":"\"6767f0be-524b\"","last-modified":"Sun, 22 Dec 2024 10:58:06 GMT","strict-transport-security":"max-age=15724800; includeSubDomains"}}}
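In a script, this wait-until-DONE check becomes a polling loop. The sketch below stays runnable offline by using a stub in place of the GET request; in real use, the stub body would be the curl | jq -r .status call shown above:

```shell
# Stub standing in for: curl -s .../extractions/<id> | jq -r .status
# It reports WAITING twice, then DONE, to simulate a short-running extraction.
POLL_COUNT=0
fetch_status() {
  POLL_COUNT=$((POLL_COUNT + 1))
  if [ "$POLL_COUNT" -lt 3 ]; then
    STATUS="WAITING"
  else
    STATUS="DONE"
  fi
}

STATUS=""
while [ "$STATUS" != "DONE" ]; do
  fetch_status
  # sleep 3  # uncomment when polling the real API
done
echo "Polled ${POLL_COUNT} times, final status: ${STATUS}"
```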
With the jq tool, you can list artifacts more clearly:
curl -X GET https://api.scraper.adcolabs.de/v1/extractions/67683c348c067b2886aa1d43 \
-H "Content-Type: application/json" \
-H "API-KEY: <API-KEY>" \
| jq .artifacts
The output will then only display the artifact URLs:
[
"https://artifacts.s3c.adcolabs.de/2024-12-22/67683c348c067b2886aa1d43/49d78836a161836a/67683c348c067b2886aa1d43_1734884442.png"
]
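Because the artifacts field is a plain list of URLs, the step from listing to downloading is short. Here is a sketch using a saved, shortened response so it runs without an API key; the actual download line is left commented out:

```shell
# Shortened sample of a finished extraction, saved as response.json
# (in practice: curl -s .../extractions/<id> -H "API-KEY: <API-KEY>" > response.json)
cat > response.json <<'EOF'
{"status":"DONE","artifacts":["https://artifacts.s3c.adcolabs.de/2024-12-22/67683c348c067b2886aa1d43/49d78836a161836a/67683c348c067b2886aa1d43_1734884442.png"]}
EOF

# Print each artifact URL on its own line
jq -r '.artifacts[]' response.json

# To download them all (requires network access):
# jq -r '.artifacts[]' response.json | xargs -n1 curl -sO
```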
The created screenshot looks like this:
Full Pages and More
With the full-page-screenshot action, you can capture entire web pages rather than just the visible viewport.
curl -X POST https://api.scraper.adcolabs.de/v1/extractions \
-H "Content-Type: application/json" \
-H "API-KEY: <API-KEY>" \
-d '{"url":"https://www.adcolabs.de/blog/hands-on-seitenextraktion/","agent":{"resolution":{"width":"1903","height":"927"},"options":{"headers":[]}},"connectivity":{"proxy":"europe"},"workflow":[{"action":"full-page-screenshot"}],"selector":[],"webhook":{"enabled":false,"headers":[]}}'
Here’s an example of the result for one of our German blog posts:
Moreover, additional parameters can simulate scrolling and clicking actions. This allows you to create specific states of the web page or target individual areas precisely. You can also add wait times to account for loading effects on the page.
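As a rough illustration, such a workflow could chain several steps before the screenshot. Note that the action names "scroll", "click", and "wait" and their value formats below are assumptions for illustration only; consult the Adcolabs app for the exact names:

```json
{
  "workflow": [
    { "action": "scroll", "value": 800 },
    { "action": "click", "value": "#consent-accept" },
    { "action": "wait", "value": 2000 },
    { "action": "screenshot" }
  ]
}
```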
Automated Screenshots with a Bash Script
For easy and flexible website screenshot creation, we’ve prepared a handy Bash script. With just a few parameters, you can capture screenshots and download them directly. The script is highly customizable and supports various screenshot types and output locations.
#!/bin/bash

# Default browser viewport used for the screenshot request
BROWSER_RESOLUTION_X="1903"
BROWSER_RESOLUTION_Y="927"

# Poll the extraction status up to MAX_RETRIES times, RETRY_INTERVAL seconds apart
MAX_RETRIES=30
RETRY_INTERVAL=3

function display_help {
    echo "Usage: $0 --url <website_url> --api-key <api_key> [--type <screenshot_type>] [--output <output_directory>]"
    echo
    echo "Options:"
    echo "  --url        The URL of the website to capture (required)."
    echo "  --api-key    API key for the screenshot service (required)."
    echo "  --type       Type of screenshot to capture. Options:"
    echo "               'screenshot' (default) or 'full-page-screenshot'."
    echo "  --output     Directory to save downloaded artifacts (default: current directory)."
    echo "  --help       Show this help message and exit."
    echo
    echo "Examples:"
    echo "  $0 --url http://example.com --api-key YOUR_API_KEY"
    echo "  $0 --url http://example.com --api-key YOUR_API_KEY --type full-page-screenshot --output /path/to/save"
}

# Default argument values
WEBSITE_URL=""
SCREENSHOT_TYPE="screenshot"
OUTPUT_DIR="."
API_KEY=""

# Parse command-line arguments
while [[ "$#" -gt 0 ]]; do
    case "$1" in
        --help)
            display_help
            exit 0
            ;;
        --url)
            WEBSITE_URL="$2"
            shift 2
            ;;
        --api-key)
            API_KEY="$2"
            shift 2
            ;;
        --type)
            SCREENSHOT_TYPE="$2"
            shift 2
            ;;
        --output)
            OUTPUT_DIR="$2"
            shift 2
            ;;
        *)
            echo "Error: Unknown argument '$1'"
            echo "Run '$0 --help' for usage information."
            exit 1
            ;;
    esac
done

# Validate arguments
if [[ -z "$WEBSITE_URL" ]]; then
    echo "Error: --url is required."
    echo "Run '$0 --help' for usage information."
    exit 1
fi

if [[ -z "$API_KEY" ]]; then
    echo "Error: --api-key is required."
    echo "Run '$0 --help' for usage information."
    exit 1
fi

if [[ "$SCREENSHOT_TYPE" != "screenshot" && "$SCREENSHOT_TYPE" != "full-page-screenshot" ]]; then
    echo "Error: Invalid value for --type. Use 'screenshot' or 'full-page-screenshot'."
    echo "Run '$0 --help' for usage information."
    exit 1
fi

if [[ ! -d "$OUTPUT_DIR" ]]; then
    echo "Error: Output directory '$OUTPUT_DIR' does not exist."
    exit 1
fi

echo "Starting screenshot extraction for URL: $WEBSITE_URL with type: $SCREENSHOT_TYPE"

# Build the JSON request payload
EXTRACTION_PAYLOAD=$(cat <<EOF
{
  "url": "${WEBSITE_URL}",
  "agent": {
    "resolution": {
      "width": "${BROWSER_RESOLUTION_X}",
      "height": "${BROWSER_RESOLUTION_Y}"
    },
    "options": {
      "headers": []
    }
  },
  "connectivity": {
    "proxy": "europe"
  },
  "workflow": [
    {
      "action": "${SCREENSHOT_TYPE}"
    }
  ],
  "selector": [],
  "webhook": {
    "enabled": false,
    "headers": []
  }
}
EOF
)

# Create the extraction and capture its ID
EXTRACTION_ID=$(curl -s -X POST "https://api.scraper.adcolabs.de/v1/extractions" \
    -H "Content-Type: application/json" \
    -H "API-KEY: ${API_KEY}" \
    -d "$EXTRACTION_PAYLOAD" | jq -r .id)

if [[ -z "$EXTRACTION_ID" || "$EXTRACTION_ID" == "null" ]]; then
    echo "Error: Failed to initiate screenshot extraction."
    exit 1
fi

echo "Extraction initiated. ID: $EXTRACTION_ID"

# Poll until the extraction is done or the retries are used up
RETRIES_LEFT=$MAX_RETRIES
EXTRACTION_STATUS=""
echo -n "Checking extraction status"
while [[ $RETRIES_LEFT -gt 0 ]]; do
    echo -n "."
    EXTRACTION_STATUS=$(curl -s -X GET "https://api.scraper.adcolabs.de/v1/extractions/${EXTRACTION_ID}" \
        -H "Content-Type: application/json" \
        -H "API-KEY: ${API_KEY}" | jq -r .status)
    case "$EXTRACTION_STATUS" in
        DONE)
            echo -e "\nExtraction completed successfully."
            break
            ;;
        WAITING|CREATED)
            ((RETRIES_LEFT--))
            sleep $RETRY_INTERVAL
            ;;
        *)
            echo -e "\nError: Unexpected extraction status - $EXTRACTION_STATUS"
            exit 1
            ;;
    esac
done

if [[ "$EXTRACTION_STATUS" != "DONE" ]]; then
    echo -e "\nError: Extraction did not complete within the allowed time."
    exit 1
fi

echo "Fetching extraction artifacts..."
ARTIFACTS=$(curl -s -X GET "https://api.scraper.adcolabs.de/v1/extractions/${EXTRACTION_ID}" \
    -H "Content-Type: application/json" \
    -H "API-KEY: ${API_KEY}" | jq -r .artifacts)

# Treat a missing or empty artifact list as an error
if [[ -z "$ARTIFACTS" || "$ARTIFACTS" == "null" || "$ARTIFACTS" == "[]" ]]; then
    echo "Error: No artifacts available for the extraction."
    exit 1
fi

echo "Extraction artifacts:"
echo "$ARTIFACTS"

# Download each artifact URL into the output directory
ARTIFACTS_ARRAY=$(echo "$ARTIFACTS" | jq -r '.[]')
for URL in $ARTIFACTS_ARRAY; do
    FILENAME=$(basename "$URL")
    OUTPUT_PATH="${OUTPUT_DIR}/${FILENAME}"
    echo "Downloading $URL to $OUTPUT_PATH"
    curl -s -o "$OUTPUT_PATH" "$URL"
    if [[ $? -ne 0 ]]; then
        echo "Error: Failed to download $URL"
    fi
done

echo "All artifacts have been downloaded to $OUTPUT_DIR."
Example Applications of the Screenshot Script
To get an overview of the usage options, run:
./screenshot-downloader.sh --help
Creating a screenshot of the visible area of a website:
./screenshot-downloader.sh --url https://www.adcolabs.de/blog/hands-on-seitenextraktion/ --api-key <API-KEY>
Creating a screenshot of the entire website:
./screenshot-downloader.sh --url https://www.adcolabs.de/blog/hands-on-seitenextraktion/ --type full-page-screenshot --api-key <API-KEY>
Saving the screenshot in a custom directory:
./screenshot-downloader.sh --url https://www.adcolabs.de/blog/hands-on-seitenextraktion/ --output <CUSTOM-DIRECTORY-PATH> --api-key <API-KEY>
Adcolabs Scraper
The Adcolabs Scraper not only allows for screenshots but also supports video recordings of websites. The script can be adjusted to utilize this feature as well. Discover more features directly in the Adcolabs app. Check it out and test the extensive possibilities!
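To adapt the Bash script for recordings, its --type validation and the workflow action would both need the recording action added. The sketch below is hypothetical: the action name "video" is our assumption, not confirmed here; check the Adcolabs app for the exact name:

```shell
# Hypothetical example value; "video" is an assumed action name
SCREENSHOT_TYPE="video"

# The script's validation, extended to accept the assumed video action
if [[ "$SCREENSHOT_TYPE" != "screenshot" && "$SCREENSHOT_TYPE" != "full-page-screenshot" && "$SCREENSHOT_TYPE" != "video" ]]; then
    echo "Error: Invalid value for --type. Use 'screenshot', 'full-page-screenshot' or 'video'."
    exit 1
fi
echo "Accepted type: ${SCREENSHOT_TYPE}"
```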