Our clients often ask how our classification team operates and exactly how we decide how content should be classified. In this installment of our “Ask the Experts” series, DoubleVerify’s (DV) CJ Morello, Sr. Director of AI Product Operations, explains how his team works hand-in-hand with artificial intelligence (AI) to classify content as accurately as possible.
Can you tell us a bit about the role of the Classification Operations teams? What does your typical day look like?
The Classification Operations team consists of AI curation specialists who annotate the content that powers DV’s brand safety and suitability products. Each day, our team reviews and classifies thousands of images, social media posts, webpages, and CTV and mobile apps under a variety of content categories. We then input those classifications into DV systems and curate them as training data for algorithms that can scale up the classification to millions of pieces of content with a high degree of accuracy. We review the performance of those automated systems daily and make adjustments in collaboration with our data science, policy and product management teams to maximize classification accuracy.
How did you get into working in this field?
I worked for a few years at GoFundMe in the trust and safety and product operations space. While I loved the work, I was hungry to learn how mission-critical trust and safety business units worked at a larger scale – powered by big data and machine learning. I found a great role at Amazon where I was able to learn about classification trust and safety from amazing data scientists and leaders in the risk management space.
What do you love most about your work?
My team works with linguists, engineers, business leaders and cutting-edge technology to train and launch machine learning models that can automatically classify any kind of content, and we also ensure that there is a human in the loop at all times. We react to trends that models wouldn’t be aware of immediately, and the new challenges and opportunities that we encounter on a daily basis keep the work exciting.
What’s one misconception people often have about content classification?
I think a common misconception is that it is enough for an advertiser to detect content they deem unsafe or unsuitable using keywords alone. An effective brand safety and suitability strategy requires comprehensive content classification.
DV’s comprehensive classification methodology combines our semantic science engine with machine learning and ensures that classifications reflect the content, context and tone of every type of content. This full contextual analysis enables brands to be confident about where their ads appear.
That’s why I value product and policy leaders on my team who have the education and passion to understand the nuances of brand safety and suitability issues and the histories behind them – both in local languages and abroad. Without this expertise, the great technology available for performing classification at scale would not be nearly as valuable.
What sort of skills help the people on your team find success in their roles?
There are two key skills that help us deliver better classification products. The first is the ability to create great processes that are easy to follow. The second is outstanding media literacy and critical thinking, which allow our human classification team to interpret the results of a classification system and understand the ‘why’ behind machine classification effortlessly. A successful product operations team can get to the bottom of any product issue quickly by instantly identifying patterns, thereby helping engineers and data scientists improve algorithms and models.
What’s the strangest or even funniest thing that you’ve seen as part of your work?
Given some of the more sensitive content that we see on a regular basis, it is important to keep a sense of humor in this field. One of the things that always surprises me, and reminds me of the creativity of bad actors, is the evolution of language use online. For example, there are a million ways to misspell a word to get around platform content moderation tools. Tactics like adding umlauts and accents to letters where they don’t belong and combining characters from different languages make for artistic spelling conventions that look more like secure passwords than sensitive words. One such spelling: Pø®ñøgråp‡‡¥.
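To make the spelling tactics above concrete, here is a minimal Python sketch of how obfuscated text might be folded back toward plain letters before it is matched or classified. It is an illustration only, not a description of DV’s systems. The function name `fold_obfuscation` and the small `LOOKALIKES` table are hypothetical, and a real table would need to be far larger and curated by people who know the languages involved.

```python
import unicodedata

# Hypothetical, hand-built table of lookalike characters that Unicode
# normalization does not decompose. Illustrative only, far from complete.
LOOKALIKES = str.maketrans({"ø": "o", "Ø": "O", "®": "r", "¥": "y"})


def fold_obfuscation(text: str) -> str:
    """Fold accented and lookalike characters toward plain lowercase letters."""
    # NFKD splits accented letters into a base letter plus combining marks,
    # e.g. "å" becomes "a" followed by a combining ring.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so only the base letters remain.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Map the remaining lookalikes by hand, then lowercase for comparison.
    return stripped.translate(LOOKALIKES).casefold()


print(fold_obfuscation("gåmblïñg"))  # gambling
```

Normalization handles the stray umlauts and accents, but symbols borrowed for their shape have to be mapped by hand, which is one reason human reviewers who track new spelling conventions remain part of the process.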
Can you tell us about some of the work that you’re most proud of?
I joined DV a few months ago, and what inspired me to come here is what originally drew me into this industry. I have always believed that this line of work has enormous value, and that failing to do it well could have consequences for brands. It can be a challenge to balance the growth of the business and the safety of your employees and customers, especially when dealing with user-generated content. DV’s business, on the other hand, is a direct result of successful and scalable brand safety and fraud protection solutions, and measurement thereof. I’m proud to be part of a company where this is “the whole ballgame.”
Are there any resources that you recommend to people who want to learn more about this field and the developments of this technology?
I love reading academic research on this topic, and the Stanford Internet Observatory publishes rigorously researched articles as well as a periodical on various topics in online safety and the effectiveness of products that are meant to promote trust and provide anti-harm defenses. I also enjoy “Marketplace Risk Platform Podcast,” which brings in experts from data science, risk management and government bodies to give different perspectives on the social, technical and economic changes and threats that continue to evolve ever more rapidly in our online world.
If you’re interested in reading more from our classifications experts, check out Anna Zapesochini’s “Ask the Experts” blog on machine learning technology.