Pages crawled by Adbeat every day
Ads detected, analyzed, and processed every day
Advertisers covered by Adbeat’s crawlers
Data Collection — Our daily workflow involves large-scale analytics processing of ads by publisher, advertiser, network, and ad type/size.
Adbeat leverages the power of cloud-based workflows to perform over 2 billion calculations each week on freshly crawled advertising data. During every hour of every day, Adbeat has hundreds of servers deployed to crawl the web on a massive scale. As the crawlers visit the most important, highly trafficked publisher web sites we cover, they detect millions of ads and send back billions of important data points.
The data is collected hourly and daily, allowing for precise, granular analysis. Extracting, analyzing, and processing all of this data while it’s still fresh is no easy task, which is why our competitors don’t do it. Rather than rely on a small, inferior dataset as they do, we challenged our engineering team (made up of MIT data scientists and programmers) to solve the problem.
Quality & Accuracy — Taming a firehose of advertising data would be a convenient excuse for letting the quality and accuracy of our data slip a bit. Nevertheless, we’ve invested heavily in proprietary technologies to ensure our data is as pristine as possible. Take, for example, the machine learning we employ to determine which of our human curators is selecting the best categories for an advertiser, or the custom web browser that lets our engineers “see” a web page just as our crawlers do, with all of the ad units and associated data points highlighted. Technologies like these help our data meet our own stringent standards for quality and accuracy.
Ad Spend — To determine accurate ad spend estimates, Adbeat’s data scientists developed a proprietary methodology. Starting with Adbeat’s sample data of millions of advertisers, publishers, and ads gathered over 7 years, they created an algorithm to estimate ad spend down to the dollar amount on display and native channels.
Keyword Extraction & OCR — Display advertising is constantly evolving, and new technologies in the form of rich media ads are continually being launched and tested. Once you venture beyond static banners, existing off-the-shelf OCR solutions fail to properly parse keywords and phrases. To solve this, we developed our unique AccuAd™ technology. Developed with the help of machine-vision experts, AccuAd™ applies state-of-the-art Optical Character Recognition (OCR) and domain-specific filtering to extract usable content from even the most complex ad formats, including HTML5 ads, page take-overs, interstitials, and a plethora of proprietary native ad widgets. Sophisticated natural language processing (NLP) and information retrieval techniques are then applied to extract the most relevant terms for each advertiser.
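To make the final step more concrete, here is a minimal sketch of one common information retrieval technique, TF-IDF term ranking, applied to ad text that has already been extracted. It is not AccuAd™ or Adbeat's actual method; the stopword list, function names, and sample ads are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a production system would use a far larger one.
STOPWORDS = {"the", "a", "an", "and", "or", "for", "on", "to", "of", "in", "your", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def top_terms(ads_by_advertiser, advertiser, k=3):
    """Rank one advertiser's terms by TF-IDF, treating each advertiser as a document."""
    docs = {name: Counter(tokenize(" ".join(texts)))
            for name, texts in ads_by_advertiser.items()}
    n_docs = len(docs)
    counts = docs[advertiser]
    total = sum(counts.values())
    scores = {}
    for term, count in counts.items():
        doc_freq = sum(1 for c in docs.values() if term in c)
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0  # smoothed IDF
        scores[term] = (count / total) * idf
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Invented sample data: ad text per advertiser, as if already produced by OCR.
sample = {
    "shoe_store": ["Free shipping on running shoes", "Running shoes sale today"],
    "bank": ["Free checking account", "Open a savings account today"],
}

print(top_terms(sample, "shoe_store", k=2))
```

Terms that appear for every advertiser, such as "free" in this sample, score lower than terms that are distinctive to one advertiser, which is the property that makes this kind of ranking useful for picking relevant keywords.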
Ad Tag Extraction — Further down the processing pipeline, we turn our computing resources to reverse engineering the exact path the ad took before it was seen by a visitor. Each ad tag is parsed and routed to powerful algorithms specially trained to understand and identify a complex web of ad networks, ad servers, DSPs, and SSPs. The resulting data gives the deepest insights possible into how inventory is flowing between what can be dozens of middlemen.
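As a rough illustration of the idea, the sketch below pulls hostnames out of an ad tag and looks them up in a table of known intermediaries. Adbeat describes trained algorithms for this step; the simple lookup here is a stand-in, and the hostnames, labels, and function names are invented for the example.

```python
import re

# Hypothetical lookup table: hostnames and labels are invented for illustration.
KNOWN_INTERMEDIARIES = {
    "ssp.example.com": ("ExampleSSP", "SSP"),
    "dsp.example.net": ("ExampleDSP", "DSP"),
    "adserver.example.org": ("ExampleAdServer", "ad server"),
}

# Capture only the hostname after the scheme, so URLs nested inside query
# strings are also found. URL-encoded nested URLs are not handled here.
HOST_PATTERN = re.compile(r"https?://([^/\s\"'<>?#:]+)")


def trace_ad_path(ad_tag_html):
    """Return known intermediaries referenced in the tag, in order of first appearance."""
    path = []
    for host in HOST_PATTERN.findall(ad_tag_html):
        entry = KNOWN_INTERMEDIARIES.get(host.lower())
        if entry and entry not in path:
            path.append(entry)
    return path


# Invented ad tag with one URL nested inside another's query string.
tag = (
    '<script src="https://ssp.example.com/tag.js?next='
    'https://dsp.example.net/bid"></script>'
    '<img src="https://adserver.example.org/pixel.gif">'
)

for name, kind in trace_ad_path(tag):
    print(f"{name} ({kind})")
```

A static table like this only recognizes hosts it already knows about, which is one reason a trained classifier is a better fit when the set of intermediaries keeps changing.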
Similarity & Recommendation Engines — Our processing of data doesn’t end with simply extracting key data points and metrics. Adbeat also works to create new data that powers our world-class toolset. For example, to support our Similarity & Recommendation Engines we developed and tuned algorithms specifically for the display and native advertising space. The algorithms use ad text, landing page text, traffic and advertising activity to cluster like advertisers and publishers; from there we generate lists of similar advertisers and publishers and tap those to offer recommendations. The result is the stunning ability to identify and present similar advertisers that are often impossible to discover by any other means.
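To show how text signals can drive similarity, here is a minimal sketch that ranks advertisers by cosine similarity over bag-of-words vectors. It is a nearest-neighbor ranking rather than the clustering described above, it uses text only (no traffic or activity data), and the profile names and text are invented.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def most_similar(profiles, target, k=2):
    """Return the k profile names most similar to the target, best first."""
    vectors = {name: vectorize(text) for name, text in profiles.items()}
    scores = {name: cosine(vectors[target], vec)
              for name, vec in vectors.items() if name != target}
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]


# Invented profiles: combined ad text and landing page text per advertiser.
profiles = {
    "runfast": "running shoes marathon training gear",
    "trailpro": "trail running shoes hiking gear",
    "quickloan": "personal loan low interest rates",
}

print(most_similar(profiles, "runfast", k=1))
```

The same similarity scores could feed a clustering step, with recommendations drawn from each advertiser's nearest neighbors within its cluster.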
Categorization Engine — To categorize advertisers quickly and with a high degree of accuracy, our engineers augmented our human curation team with machine-learning algorithms. This two-pronged approach yields excellent categorization for advertisers across all categories and industry verticals in which they operate, allowing our customers to discover new advertisers in categories of interest and create reports limited to a single vertical.
Once the data is collected and processed, we turn to our presentation and insights layer.
This part of the platform is what our customers see and interact with every day. As fans of well-designed software, we are passionate about making the Adbeat UI look as good and work as well as the data that backs it.
User Experience — To achieve an excellent user experience, we started with a crisp, modern, intuitive design and then focused on making it as snappy and responsive as possible. To handle the large volume of data being searched, accessed, and retrieved, our engineers built the platform atop the largest and fastest server hardware available in the cloud, then fully load-balanced those servers for robustness. They continued by performing critical rendering path optimization across the entire application to ensure your web browser presents data on-screen as fast as possible. The result is lightning-fast search and data retrieval that doesn’t slow down your workflow. Adbeat keeps up with how fast your ideas are flowing.
Data Presented in Meaningful Ways — The Adbeat platform is made up of thoughtfully designed dashboards and panels that allow you to make sense of the staggering amount of advertising data we collect. Our innovative dashboard system allows you to view, sort and filter data in ways that will provide the most valuable insights for your media strategy.
Data Freshness — In the fast-moving world of advertising, timely data is critical. Our competitors are content to wait as long as 10 days between updates. Adbeat leads the industry and issues updates every 48 hours to ensure you have the freshest data possible. To perform the computation necessary to update our massive indexes every other day, we combine highly optimized workflows and massively parallel computing resources. More specifically, 24 fire-breathing slave servers, each with 32 CPUs, work in concert to process over 344 million combinations of advertiser/country/platform/network/publisher and 460 million combinations of ad/country/platform/network/publisher.
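A quick back-of-the-envelope calculation puts those figures in perspective. The sketch below assumes the work is split evenly across all CPUs and spread over the full 48-hour window; both are simplifying assumptions, and the shard function and sample key are hypothetical. Since the source says "over 344 million", the results are lower bounds.

```python
import zlib

SERVERS = 24
CPUS_PER_SERVER = 32
ADVERTISER_COMBINATIONS = 344_000_000  # advertiser/country/platform/network/publisher
AD_COMBINATIONS = 460_000_000          # ad/country/platform/network/publisher
UPDATE_WINDOW_HOURS = 48

total_cpus = SERVERS * CPUS_PER_SERVER
total_combinations = ADVERTISER_COMBINATIONS + AD_COMBINATIONS
per_cpu = total_combinations / total_cpus
per_cpu_per_second = per_cpu / (UPDATE_WINDOW_HOURS * 3600)


def shard_for(key, shards=total_cpus):
    """Hypothetical helper: map a combination key to a CPU with a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % shards


print(f"CPUs: {total_cpus}")
print(f"Combinations per update: {total_combinations:,}")
print(f"Combinations per CPU: {per_cpu:,.0f}")
print(f"Per CPU per second over the window: {per_cpu_per_second:.2f}")
print(f"Example shard: {shard_for('advertiser-1/US/desktop/network-a/publisher-x')}")
```

Under these assumptions, that is 768 CPUs handling at least 804 million combinations, or a little over one million combinations per CPU in each update cycle.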