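Abstract: To meet the complex personalized demands that university users place on scientific research resources, a personalized service system for university research resources, called the personalized scientific research service system (PSRSS), is designed. The personalized research-resource needs of university users are analyzed comprehensively, and a multi-layer fusion recommendation architecture is designed, consisting of a data layer, a recommendation engine layer that fuses multiple recommendation strategies, and an application presentation layer. Different recommendation algorithms are used at different recommendation stages, and the selected algorithms are optimized for their target scenarios. The design of the user model and the research resource model is discussed, and Top-N recommendation based on resource popularity, item content similarity, and user-based collaborative filtering is implemented. The system improves the experience of obtaining research resources in universities and offers ideas for the design of personalized research resource service systems.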
Keywords: fusion recommendation; personalization; recommendation system
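With the arrival of the big data era, research management systems and research service platforms of all kinds have accumulated massive volumes of research data and resource documents [1]. As information becomes ever more personalized, many industries are trying to develop and apply personalized recommendation systems based on various algorithms and models. Amazon uses a recommendation system on its website to analyze users' browsing and purchasing behavior, and then makes personalized recommendations to users who have browsed or purchased on the site. According to statistics from VentureBeat, personalized recommendation technology has raised sales on the Amazon website, and the technology is being applied ever more widely [2-3].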
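In university research activities, retrieving research resources is reported to take up some 50% of the time involved, and users' personalized needs are becoming more complex [4]. At present, research data and research resources are still obtained mainly through keyword-based query and retrieval. The research management systems and research data platforms of universities in China still offer fairly limited functionality and cannot satisfy users' personalized needs [5].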
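On the one hand, research resource information is overloaded: faced with massive research data, users are at a loss and cannot conveniently obtain the resources they need. On the other hand, users must already know their own resource needs and express them explicitly before they can find the required resources; existing retrieval systems cannot recommend to users the research resources they may be interested in, so a large share of the available resources is underused. Against this background, and targeting the complex personalized research resource needs of universities, this paper discusses the design of a personalized research resource service system based on fusion recommendation.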
1 Related analysis
1.1 Personalized demand for research resources
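A questionnaire survey of university users identified the following main personalized research resource needs: obtaining research resources similar to one's own work by entering a research topic or paper title, together with material that is useful for one's own project; obtaining research resources with high popularity in one's current field, such as research achievements, in order to understand the main research directions of the discipline; learning about the latest research topics and research trends of academic peers; and obtaining related resources that the user had not thought of but has a latent interest in, so as to broaden the user's research perspective and find ideas that can be borrowed.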
1.2 Fusion recommendation system architecture
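No single recommendation strategy can satisfy the complex personalized needs of university users. PSRSS is therefore designed with a recommendation architecture that fuses multiple recommendation strategies and consists of a data layer, a fusion recommendation engine layer, and an application presentation layer. Data layer: this layer consists of a basic data layer and a data processing layer. The basic data layer holds user information data, research resource data, and user behavior data. User data comes mainly from the personnel system, which contains users' basic personal information; research resource data comes mainly from research and teaching-reform management systems and includes users' research achievements, papers, and research projects; user behavior data comes from the behavior logs produced when users use PSRSS.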
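The data processing layer extracts the required data from the business data and preprocesses it (conversion, cleaning, standardization, and merging) to supply the recommendation engine layer. Fusion recommendation engine layer: this layer is the core of the personalized service system. On top of the data supplied by the data layer, and according to the relationships among users, research resource items, users and items, users and users, and items and items, it builds the fusion recommendation engine from popularity-based recommendation, UserCF recommendation, and TF-IDF-based item content recommendation, so as to meet users' complex personalized needs. This layer also contains the modules that filter and rank the items recalled by the system [9-10]. Application presentation layer: this layer presents the recommendation results to users in different ways.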
2 Research resource data processing
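The data that PSRSS must store and process is massive, and distributed computation and processing are required, so the system uses the Apache Hadoop framework to store and process research resource data. In the overall process, the data tables the system needs, the attributes of each table, and the relationships between tables are listed, and two stores are built: the raw data stores (RDS) and the transformed data stores (TDS).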
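Both stores are implemented as Hive databases, with the data held in distributed storage on HDFS. RDS acts as a transition zone between the business systems and PSRSS, which avoids any impact on the source systems and supports fine-grained queries. Sqoop extracts data from the relational databases of the business systems into RDS, and Flume collects from log files the data on how users study and use research resources. Data transformation and loading: a data mapping is established for RDS and, according to the needs of the fusion recommendation system, HiveQL scripts transform the data (denoising, completion, and normalization) before it is loaded from RDS into TDS.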
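After the first extract, transform, and load (ETL) pass is complete, the ETL process must be run periodically according to the needs of the system, for example as one automated ETL run per scheduled interval.
Building the user model and the resource item model: the association between users and research resource items is the basis of personalized recommendation, so the recommendation system needs a user model and a research resource model. Depending on the user, different recommendation algorithms recall, rank, and score research resource items, and the resulting recommendation list is presented to the user [11-12]. The user model and the research resource model are the inputs of PSRSS.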
2.1 User model
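The PSRSS user model contains the user's basic information, the user's interest model for research resources, and behavior information recorded while the user uses the system and its resources. To recommend the resources a user is interested in, PSRSS must record not only the user's specific behavior toward research resource items but also how the user uses PSRSS itself, such as how long the user spends browsing the content of a given item. These behavior data are used to update item popularity and the user interest model. When university users work with a personalized research service system, they care only about the content of the resource items (querying, browsing, reading, and downloading) and are unwilling to spend much time rating items, so explicit behavior records for research resource items are hard to obtain.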
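The user model is therefore built from behavior logs recorded implicitly while users use PSRSS and its research resources. The system's user model is represented with the vector space model (VSM) proposed by Gerald Salton in the 1970s. When processing a document, this model identifies and extracts several key features of the document to represent it, assigns an appropriate weight to each selected feature, and combines the weights into a vector that represents the document.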
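Once documents are represented as vectors in the document space, the similarity between documents can be measured by computing the similarity between their vectors. In PSRSS, when a user performs some behavior on a research resource item, the weight of that behavior is recorded. These behaviors reflect the user's different degrees of interest in the item; each type of behavior is assigned a different weight in the range 0 to 1, and the weights sum to 1.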
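The vector comparison described above can be sketched as follows. This is a minimal illustration, assuming cosine similarity as the similarity measure and sparse vectors stored as dictionaries; the feature names and weights are hypothetical and not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors stored as {feature: weight} dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical feature weights, for illustration only.
doc_a = {"recommendation": 0.8, "hadoop": 0.2}
doc_b = {"recommendation": 0.4, "hadoop": 0.1}  # same direction as doc_a
doc_c = {"linguistics": 1.0}                    # shares no feature with doc_a
```

Because cosine similarity depends only on direction, doc_a and doc_b score as fully similar even though their weights differ in magnitude, while doc_c scores zero against both.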
2.2 Research resource model
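Recommendation is based on a topic model of research resource items: by defining feature vectors that reflect the main topics of an item, the similarity between items computed from those vectors is fairly accurate, and users can be recommended the resources they are likely to be interested in. In universities, the main research resources are research papers, government-funded and industry-funded projects, and the results of various competitions. For recommendation based on item popularity, the algorithm must be improved to account for the decay of an item over time. When modeling research resources, the metadata of the resource model is designed to include the item ID, resource time, resource length, resource type, and resource keywords.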
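1) type: the type of the research resource item. When recommending to users in the system cold-start stage, initial item recommendations are provided according to the user's major and research direction, for example recommending papers or resources of the corresponding type to a teacher in the English department who studies English teaching.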
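2) duration: the time elapsed since the research resource item was published in the system. In general, the newer an item is, the more strongly it is recommended, so the effect of time decay on user interest must be considered.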
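3) length: the length of the research resource item. At present most items are text, and the time a user spends reading or browsing, taken together with the length of the content, determines the user's degree of interest in the item.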
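4) The list of topic keywords of the research resource item. For item-content-based recommendation, TF-IDF is applied to the topic of the resource to obtain the item's keyword list.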
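The metadata listed above can be pictured as a small record type. The sketch below is only an illustration: the field names type, duration, and length follow the text, while item_id, keywords, the units, and the sample values are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResourceItem:
    """Illustrative research resource record; units are assumptions."""
    item_id: str
    type: str        # resource type, e.g. "paper" or "project"
    duration: float  # time since publication in the system (assumed: days)
    length: int      # content length (assumed: characters of text)
    keywords: List[str] = field(default_factory=list)  # topic keyword list


# Hypothetical item, for illustration only.
item = ResourceItem(item_id="r1", type="paper", duration=12.0, length=5400)
item.keywords.append("recommendation")
```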
3 Algorithm selection and optimization
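3.1 Algorithm selection
Personalized recommendation algorithms are the foundation of a personalized research service. The main families are content-based recommendation, collaborative filtering recommendation, association-rule-based recommendation, utility-based recommendation, knowledge-based recommendation, and hybrid recommendation. Content-based recommendation works from the content information of items and does not require users to rate items explicitly: machine learning methods derive the user's interests, and content similar to those interests is recommended. The precision of this approach can be improved by adding feature dimensions.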
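Content-based recommendation does not need users' rating records for items, so it can recommend newly added research resource items and thereby solves the item cold-start problem. Collaborative filtering algorithms are divided into user-based collaborative filtering (UserCF) and item-based collaborative filtering (ItemCF); both are neighborhood-based recommendation algorithms [15]. For relatively stable resources such as goods and books, ItemCF is more suitable, because users looking for such items have fairly fixed interests, so items similar to those they browsed before can be recommended.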
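As a rough illustration of the neighborhood idea behind UserCF, the sketch below scores unseen items by the similarity of the nearest users who interacted with them. It assumes implicit feedback stored as sets of item IDs and a cosine-style set similarity; the function names, parameters, and sample data are hypothetical, not the paper's implementation.

```python
import math
from collections import defaultdict


def user_similarity(a_items, b_items):
    """Cosine-style similarity between two users' sets of interacted items."""
    if not a_items or not b_items:
        return 0.0
    return len(a_items & b_items) / math.sqrt(len(a_items) * len(b_items))


def user_cf_recommend(target, interactions, k=2, n=3):
    """Recommend up to n unseen items using the k most similar users."""
    seen = interactions[target]
    neighbours = sorted(
        ((user_similarity(seen, items), user)
         for user, items in interactions.items() if user != target),
        reverse=True,
    )[:k]
    scores = defaultdict(float)
    for sim, user in neighbours:
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for item in interactions[user] - seen:
            scores[item] += sim
    # Highest score first; ties broken by item ID for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Hypothetical interaction data, for illustration only.
interactions = {
    "alice": {"paper1", "paper2"},
    "bob": {"paper1", "paper2", "paper3"},
    "carol": {"paper4"},
}
```

Here bob overlaps with alice and carol does not, so only bob's unseen item is proposed to alice.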
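In PSRSS, users need recommendations of the research resources that their own field and their peers are currently following, so the timeliness, specialization, and popularity of resources matter more, and the interests learned from users' usage logs in the system are more useful. User-based collaborative filtering can also recommend items that pleasantly surprise the user. The fusion follows the PSRSS application scenarios. In the system cold-start stage, an item-popularity-based algorithm recommends resources with high popularity in the user's field and research direction. Once the user has usage records in the system, user-based collaborative filtering recommends the resources that peers in the same discipline with similar interests are interested in. When the user favorites, reads, or downloads a resource, a content-based algorithm recommends resources similar to the one the user is currently interested in.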
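The stage-dependent choice of algorithm described above amounts to a small dispatch rule. The sketch below is one possible reading of it; the function name, its parameters, and the strategy labels are hypothetical.

```python
def choose_strategy(has_history, current_item=None):
    """Pick a recommendation strategy for one request.

    has_history: whether the user already has usage records in the system.
    current_item: the item the user just favorited, read, or downloaded, if any.
    """
    if current_item is not None:
        return "content_based"  # recommend items similar to the current one
    if has_history:
        return "user_cf"        # recommend what similar peers are interested in
    return "popularity"         # cold start: recommend popular items in the field
```

For example, a brand-new user with no current item gets popularity-based recommendations, while the same user reading an item gets content-based ones.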
3.2 Algorithm optimization
3.2.1 Item popularity value calculation
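When a new user starts using PSRSS, the system cannot yet provide a personalized service and faces the user cold-start problem. At this point an item-popularity-based algorithm is used: research resources are classified by basic information such as major, discipline, and research direction, ranked by item popularity, and the items with the highest popularity values are recommended to users who may be interested.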
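When a research resource item enters the repository, the system initializes a popularity score for it and the item enters the recommendation candidate list. Initial popularity scores differ between resources: they can be assigned according to the resource category, the professional level of the author (for example the professional title), and hot-word matching. As users read, favorite, and download the item, the popularity contributed by user behavior keeps increasing. Time also affects the popularity of a resource, causing it to decrease over time.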
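The interplay of initial score, behavior-driven growth, and time decay can be sketched as below. The paper's exact formula is not reproduced here: exponential decay with a half-life, the way the terms are combined, and all parameter names and defaults are assumptions made for illustration.

```python
import math


def decayed_popularity(initial, behavior_score, age_days, half_life_days=30.0):
    """Popularity after time decay (assumed exponential; half-life is hypothetical).

    initial: popularity score assigned when the item enters the repository.
    behavior_score: popularity accumulated from reads, favorites, downloads.
    age_days: time since the item was published in the system.
    """
    decay = math.exp(-math.log(2) * age_days / half_life_days)
    return (initial + behavior_score) * decay
```

Under these assumptions an untouched item loses half of its popularity every half_life_days, while continued user activity can offset the decay.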
3.2.2 Item representation
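PSRSS mainly recommends unstructured research resources, and such documents cannot be mapped directly into a vector space. The title and keywords of a resource are the key information that reflects its core and best represent its main content; users also rely mainly on the title of a resource item when deciding whether to click, read, favorite, or download it. The TF-IDF algorithm is therefore applied directly to the item topic to extract the item's keywords, and the TF-IDF value of each keyword is used as its weight. The item topic is thus mapped to a vector representation, and the similarity between items is computed from the item vectors [16-17] for content-based recommendation.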
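A minimal pure-Python sketch of this step follows, assuming titles are already tokenized and using the plain tf × log(N/df) form of TF-IDF with cosine similarity between the resulting vectors. The weighting variant, the function names, and the sample titles are assumptions, not the paper's exact formulas.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: {item_id: [tokens]} -> {item_id: {token: tf-idf weight}}."""
    n_docs = len(docs)
    doc_freq = Counter(tok for tokens in docs.values() for tok in set(tokens))
    vectors = {}
    for item_id, tokens in docs.items():
        if not tokens:
            vectors[item_id] = {}
            continue
        counts = Counter(tokens)
        vectors[item_id] = {
            tok: (cnt / len(tokens)) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse {token: weight} vectors."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm = math.sqrt(sum(w * w for w in u.values())) * \
        math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical tokenized titles, for illustration only.
titles = {
    "r1": ["fusion", "recommendation", "system"],
    "r2": ["recommendation", "system", "architecture"],
    "r3": ["english", "teaching", "reform"],
}
vecs = tfidf_vectors(titles)
```

Items r1 and r2 share two title terms and therefore end up closer to each other than either is to r3, which shares none.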
4 Top-N recommendation of research resources
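In this stage, a recommendation algorithm is chosen according to the application scenario and the user's degree of interest in research resources not yet used is predicted. The recommended resource list is ranked by user interest and resource popularity, and the top N resources are recommended to the user.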
4.1 User cold-start stage
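In this stage, recommendations are made according to item popularity. The item popularity value is computed by formula, and different weights are set according to the level of the author's professional title: for example, 0.6 for authors with an intermediate title and 0.8 for authors at the next level, with a separate weight for the highest level. The update that user behavior contributes to an item's popularity is computed as 0.2 × (number of favorites) + 0.4 × (number of reads) + 0.4 × (number of downloads).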
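In the system cold-start stage, each research resource is given its own initial popularity value after its characteristics are considered as a whole. Once the system is running, the initial popularity of a newly added item can be computed from the average popularity of its resource category. On this basis, and combined with the weight of the item's author, the current popularity of each research resource item is computed by formula. Within each resource category the items are sorted by popularity in descending order, and the top N items of each category that relate to the user's major and research direction are recommended to the user.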
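The behavior update and the per-category ranking described in this section can be sketched as follows. The coefficients 0.2, 0.4, and 0.4 come from the text; the function names, the dictionary layout, and the sample items are hypothetical, and the way the author weight enters the full popularity formula is not reproduced.

```python
from collections import defaultdict


def behavior_score(favorites, reads, downloads):
    """Popularity update contributed by user behavior (coefficients from the text)."""
    return 0.2 * favorites + 0.4 * reads + 0.4 * downloads


def top_n_per_category(items, n):
    """items: list of dicts with 'id', 'category', 'heat'.

    Returns {category: [ids of the n most popular items, descending]}.
    """
    by_category = defaultdict(list)
    for item in items:
        by_category[item["category"]].append(item)
    return {
        category: [it["id"] for it in
                   sorted(group, key=lambda it: it["heat"], reverse=True)[:n]]
        for category, group in by_category.items()
    }


# Hypothetical items, for illustration only.
items = [
    {"id": "p1", "category": "paper", "heat": behavior_score(5, 10, 10)},
    {"id": "p2", "category": "paper", "heat": behavior_score(0, 2, 1)},
    {"id": "p3", "category": "paper", "heat": behavior_score(1, 30, 5)},
    {"id": "j1", "category": "project", "heat": behavior_score(2, 2, 2)},
]
```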
4.2 Research resource item recommendation
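The jieba word segmentation library for Python is used to segment the topics of research resource documents. Stop words are then removed, and the TF-IDF value of each word is computed to obtain the topic keywords of each item. After a user performs an interest-indicating action on a research resource item, such as reading or downloading it, the system takes the keyword vector of the current item, computes the similarity to that item with similarity formula (11), sorts the results by item similarity in descending order, and makes a Top-N recommendation to the user.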
5 System performance evaluation
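The fusion recommendation of the system is evaluated with recommendation accuracy as the metric of recommendation quality. Data from users' use of PSRSS was collected, and the data in the table userresitemscore, which stores users' scores for research resource items, served as the experimental data. These data contain the various behavior records of users toward research resources, such as browsing, reading, downloading, and favoriting. The data set covers 206 users, 124 research resource items, and 35215 records of users on research resource items. 80% of the data is used as the training set and 20% as the test set to compute the recommendation accuracy of the fusion recommendation.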
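An evaluation of this kind can be sketched as below, assuming a random 80/20 split of the behavior records and precision over the recommended list as the accuracy measure. The function names, the fixed seed, and the toy data are hypothetical, and the paper's exact metric definition may differ.

```python
import random


def split_train_test(records, train_ratio=0.8, seed=0):
    """Shuffle records reproducibly and split them into train and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def precision_at_n(recommended, relevant):
    """Share of recommended items that appear in the user's test-set items."""
    if not recommended:
        return 0.0
    hits = sum(1 for item in recommended if item in relevant)
    return hits / len(recommended)


# Toy data, for illustration only.
train, test = split_train_test(range(10))
score = precision_at_n(["a", "b", "c", "d"], {"a", "c"})
```

In practice the score would be averaged over all users, with each user's recommendations generated from the training part only.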
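The results for popularity-based and content-based recommendation show that accuracy is relatively good at a moderate recommendation-list length and then gradually declines as the list grows. When the list is short, popularity-based recommendation performs better, which reflects the attention users currently pay to hot items. When the list is longer, content-based recommendation is more suitable: at that point the specialization of the discipline, and users' attention to resources related to their own current research, have a greater influence on recommendation quality.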
6 Conclusion
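Through a survey, the personalized research resource needs of university users were analyzed, a fusion recommendation system architecture was designed, suitable recommendation algorithms were chosen for each application scenario, and targeted optimizations were made. Time decay is used to refine the calculation of user interest; the characteristics of research resource items and users are used to address data sparsity and computational efficiency; for content-based recommendation, items are classified by each user's major and research direction to improve accuracy; and item popularity is computed from user weights and the effect of time to solve the system cold-start problem. By combining multiple recommendation strategies, fusion recommendation improves recommendation efficiency and accuracy and offers new ideas for improving the design of personalized research resource service systems. Future work will further refine the architecture of the research resource service system and improve user recommendation, and will design API interfaces for the system to extend the range of recommended research resources.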
References
Qin F D, Li J. Influence and exploration of big data on university teaching and research[J]. Computer Engineering & Science, 2019, 41(S1): 238-241. (in Chinese)
Linden, Smith, York J. Amazon.com recommendations: item-to-item collaborative filtering[J]. IEEE Internet Computing, 2003: 76-80.
Gomez-Uribe C A, Hunt N. The Netflix recommender system: algorithms, business value, and innovation[J]. ACM Transactions on Management Information Systems, 2016: 1-19.
Chen Y Y. On research data management service ability of colleges and universities[J]. Journal of Intelligence, 2020, 39(6): 203-207. (in Chinese)
Liu Z H, Zeng L Y. Investigation and comparative analysis of scientific research data management and sharing platform of universities in China[J]. Information and Documentation Services, 2017(6): 90-95. (in Chinese)