Netflix Technology Blog
Dec 8, 2017
By Ashok Chandrashekar, Fernando Amat, Justin Basilico and Tony Jebara
For many years, the main goal of the Netflix personalized recommendation system has been to get the right titles in front of each of our members at the right time. With a catalog spanning thousands of titles and a diverse member base spanning over a hundred million accounts, recommending the titles that are just right for each member is crucial. But the job of recommendation does not end there. Why should you care about any particular title we recommend? What can we say about a new and unfamiliar title that will pique your interest? How do we convince you that a title is worth watching? Answering these questions is critical in helping our members discover great content, especially for unfamiliar titles. One avenue to address this challenge is to consider the artwork or imagery we use to portray the titles. If the artwork representing a title captures something compelling to you, then it acts as a gateway into that title and gives you some visual “evidence” for why the title might be good for you. The artwork may highlight an actor that you recognize, capture an exciting moment like a car chase, or contain a dramatic scene that conveys the essence of a movie or TV show. If we present that perfect image on your homepage (and as they say: an image is worth a thousand words), then maybe, just maybe, you will give it a try. This is yet another way Netflix differs from traditional media offerings: we don’t have one product but over 100 million different products, one for each of our members, with personalized recommendations and personalized visuals.
A Netflix homepage without artwork. This is how historically our recommendation algorithms viewed a page.
In previous work, we discussed an effort to find the single perfect artwork for each title across all our members. Through multi-armed bandit algorithms, we hunted for the best artwork for a title, say Stranger Things, that would earn the most plays from the largest fraction of our members. However, given the enormous diversity in taste and preferences, wouldn’t it be better if we could find the best artwork for each of our members to highlight the aspects of a title that are specifically relevant to them?
Artwork for Stranger Things that each receive over 5% of impressions from our personalization algorithm. Different images cover a breadth of themes in the show to go beyond what any single image portrays.
As inspiration, let us explore scenarios where personalization of artwork would be meaningful. Consider the following examples where different members have different viewing histories. On the left are three titles a member watched in the past. To the right of the arrow is the artwork that a member would get for a particular movie that we recommend for them.
Let us consider trying to personalize the image we use to depict the movie Good Will Hunting. Here we might personalize this decision based on how much a member prefers different genres and themes. Someone who has watched many romantic movies may be interested in Good Will Hunting if we show the artwork containing Matt Damon and Minnie Driver, whereas a member who has watched many comedies might be drawn to the movie if we use the artwork containing Robin Williams, a well-known comedian.
In another scenario, let’s imagine how the different preferences for cast members might influence the personalization of the artwork for the movie Pulp Fiction. A member who watches many movies featuring Uma Thurman would likely respond positively to the artwork for Pulp Fiction that contains Uma. Meanwhile, a fan of John Travolta may be more interested in watching Pulp Fiction if the artwork features John.
Of course, not all the scenarios for personalizing artwork are this clear and obvious. So we don’t enumerate such hand-derived rules but instead rely on the data to tell us what signals to use. Overall, by personalizing artwork we help each title put its best foot forward for every member and thus improve our member experience.
Challenges
At Netflix, we embrace personalization and algorithmically adapt many aspects of our member experience, including the rows we select for the homepage, the titles we select for those rows, the galleries we display, the messages we send, and so forth. Each new aspect that we personalize has unique challenges; personalizing the artwork we display is no exception and presents different personalization challenges. One challenge of image personalization is that we can only select a single piece of artwork to represent each title in each place we present it. In contrast, typical recommendation settings let us present multiple selections to a member where we can subsequently learn about their preferences from the item a member selects. This means that image selection is a chicken-and-egg problem operating in a closed loop: if a member plays a title it can only come from the image that we decided to present to that member. What we seek to understand is when presenting a specific piece of artwork for a title influenced a member to play (or not to play) a title and when a member would have played a title (or not) regardless of which image we presented. Therefore artwork personalization sits on top of the traditional recommendation problem and the algorithms need to work in conjunction with each other. Of course, to properly learn how to personalize artwork we need to collect a lot of data to find signals that indicate when one piece of artwork is significantly better for a member.
Another challenge is to understand the impact of changing the artwork that we show a member for a title between sessions. Does changing artwork reduce recognizability of the title and make it difficult to visually locate the title again, for example if the member was interested before but had not yet watched it? Or does changing the artwork itself lead the member to reconsider it due to an improved selection? Clearly, if we find better artwork to present to a member we should probably use it; but continuous changes can also confuse people. Changing images also introduces an attribution problem as it becomes unclear which image led a member to be interested in a title.
Next, there is the challenge of understanding how artwork performs in relation to other artwork we select in the same page or session. Maybe a bold close-up of the main character works for a title on a page because it stands out compared to the other artwork. But if every title had a similar image then the page as a whole may not seem as compelling. Looking at each piece of artwork in isolation may not be enough and we need to think about how to select a diverse set of images across titles on a page and across a session. Beyond the artwork for other titles, the effectiveness of the artwork for a title may depend on what other types of evidence and assets (e.g. synopses, trailers, etc.) we also display for that title. Thus, we may need a diverse selection where each can highlight complementary aspects of a title that may be compelling to a member.
To achieve effective personalization, we also need a good pool of artwork for each title. This means that we need several assets where each is engaging, informative and representative of a title to avoid “clickbait”. The set of images for a title also needs to be diverse enough to cover a wide potential audience interested in different aspects of the content. After all, how engaging and informative a piece of artwork is truly depends on the individual seeing it. Therefore, we need to have artwork that highlights not only different themes in a title but also different aesthetics. Our teams of artists and designers strive to create images that are diverse across many dimensions. They also take into consideration the personalization algorithms which will select the images during their creative process for generating artwork.
Finally, there are engineering challenges to personalize artwork at scale. One challenge is that our member experience is very visual and thus contains a lot of imagery. So using personalized selection for each asset means handling a peak of over 20 million requests per second with low latency. Such a system must be robust: failing to properly render the artwork in our UI significantly degrades the experience. Our personalization algorithm also needs to respond quickly when a title launches, which means rapidly learning to personalize in a cold-start situation. Then, after launch, the algorithm must continuously adapt as the effectiveness of artwork may change over time as both the title evolves through its life cycle and member tastes evolve.
Contextual bandits approach
Much of the Netflix recommendation engine is powered by machine learning algorithms. Traditionally, we collect a batch of data on how our members use the service. Then we run a new machine learning algorithm on this batch of data. Next we test this new algorithm against the current production system through an A/B test. An A/B test helps us see if the new algorithm is better than our current production system by trying it out on a random subset of members. Members in group A get the current production experience while members in group B get the new algorithm. If members in group B have higher engagement with Netflix, then we roll out the new algorithm to the entire member population. Unfortunately, this batch approach incurs regret: many members over a long period of time did not benefit from the better experience. This is illustrated in the figure below.
To reduce this regret, we move away from batch machine learning and consider online machine learning. For artwork personalization, the specific online learning framework we use is contextual bandits. Rather than waiting to collect a full batch of data, waiting to learn a model, and then waiting for an A/B test to conclude, contextual bandits rapidly figure out the optimal personalized artwork selection for a title for each member and context. Briefly, contextual bandits are a class of online learning algorithms that trade off the cost of gathering training data required for learning an unbiased model on an ongoing basis with the benefits of applying the learned model to each member context. In our previous unpersonalized image selection work, we used non-contextual bandits where we found the winning image regardless of the context. For personalization, the member is the context as we expect different members to respond differently to the images.
A key property of contextual bandits is that they are designed to minimize regret. At a high level, the training data for a contextual bandit is obtained through the injection of controlled randomization in the learned model’s predictions. The randomization schemes can vary in complexity from simple epsilon-greedy formulations with uniform randomness to closed loop schemes that adaptively vary the degree of randomization as a function of model uncertainty. We broadly refer to this process as data exploration. The number of candidate artworks that are available for a title along with the size of the overall population for which the system will be deployed informs the choice of the data exploration strategy. With such exploration, we need to log information about the randomization for each artwork selection. This logging allows us to correct for skewed selection propensities and thereby perform offline model evaluation in an unbiased fashion, as described later.
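As a rough illustration of the simplest scheme mentioned above, the sketch below implements epsilon-greedy selection and returns the propensity that would need to be logged with each impression. The function name, the `scores` input, and the `epsilon` parameter are assumptions made for illustration; the article does not say which exploration scheme runs in production.

```python
import random


def epsilon_greedy_select(scores, epsilon, rng):
    """Choose an image index and report how likely this policy was to choose it.

    scores:  predicted play probabilities, one per candidate image (hypothetical input).
    epsilon: probability of ignoring the model and exploring uniformly at random.
    rng:     a random.Random instance, passed in so runs can be reproduced.
    """
    n = len(scores)
    best = max(range(n), key=lambda i: scores[i])

    if rng.random() < epsilon:
        chosen = rng.randrange(n)   # explore: uniform over all candidates
    else:
        chosen = best               # exploit: the model's current favorite

    # Propensity = probability that this policy selects `chosen`.
    # Every image can be reached through exploration (epsilon / n);
    # the best image additionally receives the exploit mass (1 - epsilon).
    propensity = epsilon / n
    if chosen == best:
        propensity += 1.0 - epsilon
    return chosen, propensity


if __name__ == "__main__":
    rng = random.Random(7)
    image, p = epsilon_greedy_select([0.02, 0.05, 0.03], epsilon=0.1, rng=rng)
    print(image, p)
```

Storing the propensity next to each logged impression is what later allows skewed selection probabilities to be corrected during offline evaluation.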
Exploration in contextual bandits typically has a cost (or regret) due to the fact that our artwork selection in a member session may not use the predicted best image for that session. What impact does this randomization have on the member experience (and consequently on our metrics)? With over a hundred million members, the regret incurred by exploration is typically very small and is amortized across our large member base with each member implicitly helping provide feedback on artwork for a small portion of the catalog. This makes the cost of exploration per member negligible, which is an important consideration when choosing contextual bandits to drive a key aspect of our member experience. Randomization and exploration with contextual bandits would be less suitable if the cost of exploration were high.
Under our online exploration scheme, we obtain a training dataset that records, for each (member, title, image) tuple, whether that selection resulted in a play of the title or not. Furthermore, we can control the exploration such that artwork selections do not change too often. This gives a cleaner attribution of the member’s engagement to specific artwork. We also carefully determine the label for each observation by looking at the quality of engagement to avoid learning a model that recommends “clickbait” images: ones that entice a member to start playing but ultimately result in low-quality engagement.
Model training
In this online learning setting, we train our contextual bandit model to select the best artwork for each member based on their context. We typically have up to a few dozen candidate artwork images per title. To learn the selection model, we can consider a simplification of the problem by ranking images for a member independently across titles. Even with this simplification we can still learn member image preferences across titles because, for every image candidate, we have some members who were presented with it and engaged with the title and some members who were presented with it and did not engage. These preferences can be modeled to predict for each (member, title, image) tuple, the probability that the member will enjoy a quality engagement. These can be supervised learning models or contextual bandit counterparts with Thompson Sampling, LinUCB, or Bayesian methods that intelligently balance making the best prediction with data exploration.
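To make one of the named options concrete, here is a minimal Thompson Sampling sketch that keeps a Beta posterior over the play probability of each image within a coarse context bucket. Treating the context as a single discrete label (for example a dominant genre) is a simplifying assumption, as are the class and method names; the article's models use richer features and are not described at this level of detail.

```python
import random
from collections import defaultdict


class BetaThompsonSampler:
    """Thompson Sampling with an independent Beta posterior per (context, image)."""

    def __init__(self, n_images, rng=None):
        self.n_images = n_images
        self.rng = rng or random.Random()
        # Beta(1, 1) prior, stored as [alpha, beta] = [plays + 1, non-plays + 1].
        self.params = defaultdict(lambda: [1.0, 1.0])

    def select(self, context):
        # Draw one plausible play probability per image from its posterior,
        # then show the image whose draw is highest. Uncertain images get
        # wide posteriors, so they are still explored from time to time.
        draws = [
            self.rng.betavariate(*self.params[(context, image)])
            for image in range(self.n_images)
        ]
        return max(range(self.n_images), key=lambda image: draws[image])

    def update(self, context, image, played):
        # A quality play counts as a success, anything else as a failure.
        alpha_beta = self.params[(context, image)]
        if played:
            alpha_beta[0] += 1.0
        else:
            alpha_beta[1] += 1.0


if __name__ == "__main__":
    sampler = BetaThompsonSampler(n_images=3, rng=random.Random(0))
    shown = sampler.select("comedy")
    sampler.update("comedy", shown, played=True)
    print(shown, dict(sampler.params))
```

The appeal of this family of methods is that exploration shrinks on its own as evidence accumulates, rather than being fixed by a constant like epsilon.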
Potential signals
In contextual bandits, the context is usually represented as a feature vector provided as input to the model. There are many signals we can use as features for this problem. In particular, we can consider many attributes of the member: the titles they’ve played, the genre of the titles, interactions of the member with the specific title, their country, their language preferences, the device that the member is using, the time of day and the day of week. Since our algorithm selects images in conjunction with our personalized recommendation engine, we can also use signals regarding what our various recommendation algorithms think of the title, irrespective of what image is used to represent it.
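The sketch below shows one way a few of these signals could be turned into a numeric context vector. The genre and device vocabularies, the function name, and the specific encodings (genre share, one-hot device, cyclical hour of day) are all hypothetical choices for illustration, not the features Netflix actually uses.

```python
import math

GENRES = ["comedy", "romance", "action"]   # hypothetical genre vocabulary
DEVICES = ["tv", "mobile", "web"]          # hypothetical device vocabulary


def build_context_vector(genre_play_counts, device, hour_of_day):
    """Encode a member's viewing history, device, and time of day as floats."""
    # Share of past plays per genre, so heavy and light viewers are comparable.
    total = sum(genre_play_counts.get(genre, 0) for genre in GENRES)
    genre_share = [
        genre_play_counts.get(genre, 0) / total if total else 0.0
        for genre in GENRES
    ]

    # One-hot encoding of the device in use.
    device_one_hot = [1.0 if device == d else 0.0 for d in DEVICES]

    # Cyclical encoding keeps 23:00 and 00:00 close together.
    angle = 2.0 * math.pi * hour_of_day / 24.0
    time_features = [math.sin(angle), math.cos(angle)]

    return genre_share + device_one_hot + time_features


if __name__ == "__main__":
    print(build_context_vector({"comedy": 3, "romance": 1}, "tv", hour_of_day=21))
```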
An important consideration is that some images are naturally better than others in the candidate pool. We observe the overall take rates for all the images in our data exploration, which is simply the number of quality plays divided by the number of impressions. Our previous work on unpersonalized artwork selection used overall differences in take rates to determine the single best image to select for a whole population. In our new contextual personalized model, the overall take rates are still important and personalization still recovers selections that agree on average with the unpersonalized model’s ranking.
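The take rate defined above is a simple ratio, and the unpersonalized baseline amounts to picking the image with the highest ratio. A small sketch follows, with hypothetical function names and made-up counts used purely as an example:

```python
def take_rate(quality_plays, impressions):
    """Quality plays divided by impressions; None when there is no data."""
    if impressions == 0:
        return None
    return quality_plays / impressions


def best_unpersonalized_image(stats):
    """Return the image id with the highest overall take rate.

    stats maps image id -> (quality_plays, impressions). Images that were
    never shown are skipped because their take rate is undefined.
    """
    rates = {
        image: take_rate(plays, shown)
        for image, (plays, shown) in stats.items()
        if shown > 0
    }
    return max(rates, key=rates.get)


if __name__ == "__main__":
    # Illustrative counts only, not real data.
    stats = {"close_up": (30, 1000), "ensemble": (45, 1000), "new_art": (0, 0)}
    print(best_unpersonalized_image(stats))
```

A personalized model can be thought of as computing the same ratio conditioned on context, which is why its choices tend to agree on average with this overall ranking.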
Image Selection
The optimal assignment of image artwork to a member is a selection problem to find the best candidate image from a title’s pool of available images. Once the model is trained as above, we use it to rank the images for each context. The model predicts the probability of play for a given image in a given member context. We sort a candidate set of images by these probabilities and pick the one with the highest probability. That is the image we present to that particular member.
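The ranking step described here can be sketched in a few lines. The `predict_play_probability` callable stands in for whatever trained model is available, and its signature is an assumption for this example:

```python
def select_image(predict_play_probability, context, candidate_images):
    """Rank a title's candidate images for one context and pick the top one.

    predict_play_probability: callable (context, image) -> float, a stand-in
        for the trained model (hypothetical interface).
    Returns the chosen image and the full ranking as (probability, image) pairs.
    """
    scored = [
        (predict_play_probability(context, image), image)
        for image in candidate_images
    ]
    # Sort by predicted probability only, highest first.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return ranked[0][1], ranked


if __name__ == "__main__":
    # Toy lookup table in place of a real model; the numbers are invented.
    toy_model = {
        ("comedy_fan", "robin_williams"): 0.08,
        ("comedy_fan", "kissing_couple"): 0.03,
        ("romance_fan", "robin_williams"): 0.02,
        ("romance_fan", "kissing_couple"): 0.07,
    }
    predict = lambda context, image: toy_model[(context, image)]
    chosen, _ = select_image(predict, "comedy_fan", ["robin_williams", "kissing_couple"])
    print(chosen)
```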
Performance evaluation
Offline
To evaluate our contextual bandit algorithms prior to deploying them online on real members, we can use an offline technique known as replay [1]. This method allows us to answer counterfactual questions based on the logged exploration data (Figure 1). In other words, we can compare offline, in an unbiased way, what would have happened in historical sessions under different scenarios if we had used different algorithms.
Figure 1: Simple example of calculating a replay metric from logged data. For each member, a random image was assigned (top row). The system logged the impression and whether the profile played the title (green circle) or not (red circle). The replay metric for a new model is calculated by matching the profiles where the random assignment and the model assignment are the same (black square) and computing the take fraction over that subset.
Replay allows us to see how members would have engaged with our titles if we had hypothetically presented images that were selected through a new algorithm rather than the algorithm used in production. For images, we are interested in several metrics, particularly the take fraction, as described above.
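The replay computation in Figure 1 can be sketched as follows, assuming the logged images were assigned uniformly at random as in that figure. The event layout and the function name are hypothetical; when logging propensities are not uniform, each matched event would additionally need to be weighted by the inverse of its logged propensity.

```python
def replay_take_fraction(logged_events, policy):
    """Estimate a policy's take fraction from uniformly randomized logs.

    logged_events: iterable of (context, shown_image, played) tuples, where
        shown_image was chosen uniformly at random at logging time.
    policy: callable context -> image, the candidate model being evaluated.
    Returns plays / matches over events where the policy agrees with the log,
    or None when there are no matches.
    """
    matches = 0
    plays = 0
    for context, shown_image, played in logged_events:
        # Keep only sessions where the new policy would have shown
        # the same image that was actually shown.
        if policy(context) == shown_image:
            matches += 1
            plays += int(played)
    if matches == 0:
        return None
    return plays / matches


if __name__ == "__main__":
    # Invented log for illustration.
    log = [
        ("comedy", "robin", True),
        ("comedy", "couple", False),
        ("romance", "couple", True),
        ("romance", "robin", False),
        ("comedy", "robin", False),
    ]
    contextual = lambda ctx: "robin" if ctx == "comedy" else "couple"
    always_robin = lambda ctx: "robin"
    print(replay_take_fraction(log, contextual))
    print(replay_take_fraction(log, always_robin))
```

Because only matching events are used, replay discards a large share of the log, so estimates for policies that rarely agree with the random assignment will be noisy.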
Figure 2 shows how the contextual bandit approach helps increase the average take fraction across the catalog compared to random selection or non-contextual bandits.
Figure 2: Average image take fraction (the higher the better) for different algorithms based on replay from logged image explore data. The Random (green) policy selects one image at random. The simple Bandit algorithm (yellow) selects the image with highest take fraction. Contextual Bandit algorithms (blue and pink) use context to select different images for different members.
Figure 3: Example of contextual image selection based on the type of profile. Comedy refers to a profile that mostly watches comedy titles. Similarly, Romance watches mostly romantic titles. The contextual bandit selects the image of Robin Williams, a famous comedian, for comedy-inclined profiles while selecting an image of a kissing couple for profiles more inclined towards romance.
Online
After experimenting with many different models offline and finding ones that had a substantial increase in replay, we ultimately ran an A/B test to compare the most promising personalized contextual bandits against unpersonalized bandits. As we suspected, the personalization worked and generated a significant lift in our core metrics. We also saw a reasonable correlation between what we measured offline in replay and what we saw online with the models. The online results also produced some interesting insights. For example, the improvement of personalization was larger in cases where the member had no prior interaction with the title. This makes sense because we would expect that the artwork would be more important to someone when a title is less familiar.
Conclusion
With this approach, we’ve taken our first steps in personalizing the selection of artwork for our recommendations and across our service. This has resulted in a meaningful improvement in how our members discover new content… so we’ve rolled it out to everyone! This project is the first instance of personalizing not just what we recommend but also how we recommend to our members. But there are many opportunities to expand and improve this initial approach. These opportunities include developing algorithms to handle cold-start by personalizing new images and new titles as quickly as possible, for example by using techniques from computer vision. Another opportunity is extending this personalization approach across other types of artwork we use and other evidence that describes our titles such as synopses, metadata, and trailers. There is also an even broader problem: helping artists and designers figure out what new imagery we should add to the set to make a title even more compelling and personalizable.
If these types of challenges interest you, please let us know! We are always looking for great people to join our team, and, for these types of projects, we are especially excited by candidates with machine learning and/or computer vision expertise.
References
[1] L. Li, W. Chu, J. Langford, and X. Wang, “Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms,” in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, New York, NY, USA, 2011, pp. 297–306.