Grey forecast model for accurate recommendation in the presence of data sparsity and correlation
CF models covered: clustering CF models, Bayesian belief nets (BNs) CF models, Markov decision process based (MDP-based) CF models, and latent semantic CF models.

Dimensionality reduction techniques (SVD) are used to handle the data sparsity problem, but they lose key data. This paper instead uses the simplest method, the Cosine Distance measurement method.

Principle: we do not directly use the exact value of the similarities, but rather rank the items according to their similarities.

Application fields: such as finance [23], integrated circuit industry [24], the market for air travel [25], and underground pressure for working surface [26].

Experimental datasets: MovieLens and EachMovie.

Paper structure: Section 2 describes traditional CF methods and item-based CF (ICF) methods, the existing problems, and the authors' contributions. Section 3 describes the proposed GF model algorithm in detail. Section 4 describes the experimental study, including the datasets, evaluation metrics, methods, and analysis of the experiments, followed by the conclusions and future work.

2. The main work falls into two parts: (1) similarity measurement and (2) rating prediction.

Similarity measurement:
1. In ICF methods, the similarity s(ix, iy) between the items ix and iy is determined by the users who have rated both items.
2. The most popular methods are Cosine Distance and Pearson Correlation. Notation: let I be the set of all items rated by both the users ux and uy, and let U be the set of all users who have rated the items ix and iy.

Example: the item set I is {Bread, Milk}. d is equal to the size of set I, so in this case d = 2. Cake (ix) and Milk (iy) are rated by both Alice and Lucy (the user set U).

2.1 Similarity computation methods

2.1.1 Cosine Distance
Used to compute the similarity between two vectors. For UCF, the similarity between two users is calculated with the Cosine Distance method from the ratings of user ux and user uy on each item i. For ICF, the similarity between two items is calculated from the ratings of each user u on items ix and iy.

2.1.2 Pearson correlation coefficient
During similarity computation, the correlation in the ratings can be removed by using average ratings, so the Pearson correlation coefficient improves the accuracy of the similarity computation to some extent. For the similarity between users, the average is the user's mean rating over all movies. The computation for items is analogous.

2.2 Rating prediction
Idea: The k Nearest Neighbors (KNN) method [37] is usually used for prediction by weighting the sum of the ratings that similar users give to the target item or the ratings of the active user on similar items, depending on whether UCF or ICF is used.

2.2.1 User-based
Idea: it is based on the basic assumption that people who share similar past preferences will be interested in similar items.
Algorithm steps: first, the similarities between the users are computed using the similarity measurement methods introduced in Section 2.1; then, the prediction for the active user is determined by taking the weighted average of all the ratings of the similar users for a certain item [37] according to the formula in Eq. (5); finally, the items with the highest predicted ratings will be recommended to the user. Here U(ux) denotes the set of users similar to the user ux, and p(ux, i) is the prediction for the user ux on item i.

2.2.2 Item-based
Idea: the algorithm recommends items to users that are similar to the items that they have already consumed. Similarly, the prediction is made after calculating the similarities between the items, where I(ix) denotes the set of similar items of item ix and p(u, ix) denotes the prediction of user u on item ix.

2.3 Problem analysis
Viewed from data sparsity and data correlation: the similarity values themselves are not used; they serve only to order the data. Key point of this paper: we only rank the items according to the similarity. For both user similarity and item similarity, the method uses the ranking that the computed similarities induce rather than the computed values, because the similarity values themselves contain errors.

Then, to generate the prediction of the active user u on item i, the k items most similar to item i that have been rated by the active user are selected. Finally, we use these items as the input to build a GF model and predict the rating of the active user u on item i. If the user u has not rated k items, a fixed value will be used to complete the k ratings. Empirically, the fixed value can be the median value of the rating scale. For example, when the rating scale is 1 to 5, the number 3 is selected as the fixed value.

The proposed method provides the following three main contributions (advantages):
1. Overcoming data sparsity
2. Benefiting from data correlation
3. Obtaining accurate predictions

3. Proposed algorithm
Idea: use the ratings of similar users for a target item, or the ratings of the active user for similar items, to generate the prediction.
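The conventional ICF pipeline that this idea starts from (Cosine Distance over the users who rated both items, then a similarity-weighted average of the active user's ratings) can be sketched as below. This is a minimal illustration, not the paper's code: the `ratings` dict layout, the function names, the third user Bob, and all rating values are assumptions, and the weighted average is the plain form without mean-centering, since the exact prediction formula is not reproduced in these notes.

```python
from math import sqrt


def cosine_item_similarity(ratings, ix, iy):
    """Cosine similarity of items ix and iy over users who rated both.

    `ratings` maps user -> {item: rating} (an assumed layout).
    """
    common = [u for u, r in ratings.items() if ix in r and iy in r]
    if not common:
        return 0.0
    dot = sum(ratings[u][ix] * ratings[u][iy] for u in common)
    norm_x = sqrt(sum(ratings[u][ix] ** 2 for u in common))
    norm_y = sqrt(sum(ratings[u][iy] ** 2 for u in common))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


def predict_icf(ratings, user, target, k):
    """Similarity-weighted average of the user's ratings on the k items
    most similar to `target` (plain KNN form, an assumption)."""
    sims = [(cosine_item_similarity(ratings, target, j), j)
            for j in ratings[user] if j != target]
    top = sorted(sims, reverse=True)[:k]
    weight = sum(s for s, _ in top)
    if weight == 0:
        return None  # no usable neighbours
    return sum(s * ratings[user][j] for s, j in top) / weight


# Hypothetical ratings on a 1 to 5 scale; Bob has not rated Cake.
toy_ratings = {
    "Alice": {"Cake": 4, "Milk": 4, "Bread": 2},
    "Lucy": {"Cake": 2, "Milk": 2, "Bread": 5},
    "Bob": {"Milk": 5, "Bread": 1},
}
prediction = predict_icf(toy_ratings, "Bob", "Cake", k=2)
```

Note that this baseline multiplies each rating by the exact similarity value, which is the dependence on error-prone similarities that Section 2.3 argues against.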

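In this paper, the GF model is used for rating prediction. It involves two steps: rating preprocessing and rating prediction.

3.1 Rating preprocessing
The similarity between items is used to generate the rating prediction. Algorithm steps: First, for simplicity, the Cosine Distance method is utilized to compute the similarity between two items. Then, an m × m similarity matrix is generated, where m is the number of items. If we want to predict the unrated entry of the user u on item i in the rating matrix, the k items most similar to item i that have been rated by the user u are selected. Note that when the user u has not rated k items, the fixed value, treated as having the lowest similarity, will be used to complete the k ratings. Finally, the k ratings are sorted in increasing order of their similarity to item i to produce a rating sequence. In the next step, the proposed algorithm inputs the rating sequence to the GF model and forecasts the rating that the user u will give to item i.

In short: after the item similarities are computed, the items are sorted by similarity (descending). When the user has rated fewer than K neighbor items, the missing ratings are filled in with the fixed value. For example, a sequence that is (4, 3, 5) for K = 3 becomes (3, 3, 5, 4, 4, 3, 5) when K = 7.

Worked example: using cosine similarity (the similarity matrix), the items similar to i1 are, in descending order of similarity, 10, 6, 2, 8, 4, 9, 3, 7, 5. User u3 has rated only items 3, 4, 5, 7, and 9, which in increasing order of similarity are (5, 7, 3, 9, 4). If k = 3, items 3, 9, 4 are chosen, because they are the most similar items that u3 has rated. If K = 7, only five items have been rated, with ratings 5, 4, 4, 3, 5; the remaining two positions are filled with the fixed value 3 at the lowest-similarity end, giving (3, 3, 5, 4, 4, 3, 5). When ties occur, the order is decided at random.

Two observations:
1. Ordering the ratings by similarity lets the similarity ranking between items contribute more effectively to the predicted rating.
2. Only the K most similar items are selected, so the prediction is more accurate.

3.2 Rating prediction
Why the grey forecast model: grey system theory mainly focuses on model uncertainty and information insufficiency when analyzing and understanding systems via research on conditional analysis, prediction, and decision making. A recommender system can be considered as a grey system; further, with our algorithm, the GF model is used to yield the rating prediction. The GF model utilizes accumulated generation operations to build differential equations, which benefit from the data correlations. Meanwhile, it has another significant characteristic wherein it requires less data, so it can overcome the data sparsity problem. The rating sequence generated in the rating preprocessing stage is the only input required for model construction and subsequent forecasting.

Step 1: Set up the original rating sequence, where K is the number of nearest-neighbor items.

Step 2: Generate a new sequence from it by the accumulated generating operation (AGO). This step is the most important. For example, a user's original rating sequence obviously does not have a clear regularity; if AGO is applied to this sequence, the resulting sequence has a clear growing tendency.

Step 3: The grey derivative and the grey background value are related by an approximately linear regression over a smooth discrete function. The grey differential model GM(1,1) is defined with two coefficients a and b, where a is the grey development coefficient and b is …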
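The equations for Steps 1 to 3 did not survive in these notes, so the sketch below follows the standard textbook form of GM(1,1) rather than the paper's exact notation: AGO is a running sum, the background value is the mean of consecutive AGO terms, and a and b are fitted by least squares to x0(j) + a·z1(j) = b. The function names, the minimum length of three, and the fallback for a ≈ 0 are assumptions made for illustration; positive ratings are assumed so that the regression is well defined.

```python
from math import exp


def ago(seq):
    """Accumulated generating operation: running sums of the sequence."""
    total, out = 0.0, []
    for x in seq:
        total += x
        out.append(total)
    return out


def gm11_forecast(seq):
    """One-step-ahead forecast with the standard GM(1,1) model.

    Assumes at least 3 positive values (an illustrative requirement).
    """
    n = len(seq)
    if n < 3:
        raise ValueError("need at least 3 values")
    x1 = ago(seq)
    # Background values: means of consecutive accumulated terms.
    z = [0.5 * (x1[i] + x1[i - 1]) for i in range(1, n)]
    y = list(seq[1:])
    m = n - 1
    sz, sy = sum(z), sum(y)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    szz = sum(zi * zi for zi in z)
    # Least-squares fit of y = -a * z + b.
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    a = -slope
    b = (sy - slope * sz) / m
    if abs(a) < 1e-12:
        return b  # flat sequence: the model reduces to x0(j) = b
    # Time response of the accumulated series, differenced once.
    c = seq[0] - b / a
    return c * (exp(-a * n) - exp(-a * (n - 1)))


# Rating sequence from the K = 7 example in Section 3.1.
accumulated = ago([3, 3, 5, 4, 4, 3, 5])
```

The accumulated sequence here is 3, 6, 11, 15, 19, 22, 27, which illustrates the growing tendency that Step 2 describes even though the raw ratings fluctuate.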

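Returning to the rating preprocessing step of Section 3.1, its worked example can be reproduced with a short sketch. The function name and argument layout are hypothetical, and the per-item ratings of u3 are inferred from the sequences quoted in the example (items 5, 7, 3, 9, 4 with ratings 5, 4, 4, 3, 5), not taken from a rating table.

```python
def rating_sequence(ranked_items, user_ratings, k, fixed_value=3):
    """Build the k-length rating sequence fed to the GF model.

    ranked_items: items sorted by decreasing similarity to the target item.
    user_ratings: {item: rating} for the active user.
    Returns ratings in increasing order of similarity; missing positions
    are padded with `fixed_value` at the lowest-similarity end.
    """
    # Keep the k most similar items that the user has actually rated.
    rated = [item for item in ranked_items if item in user_ratings][:k]
    # Reverse so the most similar item's rating comes last.
    sequence = [user_ratings[item] for item in reversed(rated)]
    padding = [fixed_value] * (k - len(sequence))
    return padding + sequence


# Items ranked by decreasing similarity to i1, as listed in the example.
ranked = [10, 6, 2, 8, 4, 9, 3, 7, 5]
# Ratings of u3, inferred from the example's sequences (an assumption).
u3 = {5: 5, 7: 4, 3: 4, 9: 3, 4: 5}

seq_k3 = rating_sequence(ranked, u3, k=3)
seq_k7 = rating_sequence(ranked, u3, k=7)
```

With these inputs the sketch yields (4, 3, 5) for k = 3 and (3, 3, 5, 4, 4, 3, 5) for k = 7, matching the sequences in the notes. The default of 3 corresponds to the median of a 1 to 5 rating scale.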