1. The Daum Movie Scraper I had been using without trouble stopped working after Daum changed its service.
2. The other endpoints (detail.json, cast_crew.json, list.json) still work, but the search API (movie.json) no longer does.
3. I referred to the Kodi scraper how-to: https://kodi.wiki/view/Scrapers
4. Starting from the existing scraper, I modified it to parse the Daum search results page so that it works again.
- Install https://github.com/hojel/metadata.movie.daum.net (its author also made the Plex Daum agent).
- Modify the URL (http://~~) in <CreateSearchUrl> and the <expression> parts of <GetSearchResults>.
- Before
<CreateSearchUrl dest="3">
<RegExp input="$$1" output="<url>http://movie.daum.net/data/movie/search/v2/movie.json?size=20&start=1&searchText=\1</url>" dest="3">
<expression noclean="1" />
</RegExp>
</CreateSearchUrl>
<GetSearchResults dest="4">
<RegExp input="$$5" output="<?xml version="1.0" encoding="UTF-8" standalone="yes"?><results>\1</results>" dest="4">
<RegExp conditional="!OrigTitleInSrchResult" input="$$1" output="<entity><title>\2</title><year>\4</year><id>\1</id><url cache="daum-movie-\1.json">http://movie.daum.net/data/movie/movie_info/detail.json?movieId=\1</url></entity>" dest="5">
<expression repeat="yes" trim="2,3">"movieId":(\d+),"titleKo":"([^"]*)","titleEn":"?([^",]*)"?,[^}]*"prodYear":(\d*)</expression>
</RegExp>
<RegExp conditional="OrigTitleInSrchResult" input="$$6" output="\1" dest="5">
<RegExp input="$$1" output="<entity><title>\2(\3)</title><year>\4</year><id>\1</id><url cache="daum-movie-\1.json">http://movie.daum.net/data/movie/movie_info/detail.json?movieId=\1</url></entity>" dest="6">
<expression repeat="yes" trim="2,3">"movieId":(\d+),"titleKo":"([^"]*)","titleEn":"?([^",]*)"?,[^}]*"prodYear":(\d*)</expression>
</RegExp>
<expression noclean="1" />
</RegExp>
<expression noclean="1" />
</RegExp>
</GetSearchResults>
- After
<CreateSearchUrl dest="3">
<RegExp input="$$1" output="<url>http://search.daum.net/search?w=tot&q=\1</url>" dest="3">
<expression noclean="1" />
</RegExp>
</CreateSearchUrl>
<GetSearchResults dest="4">
<RegExp input="$$5" output="<?xml version="1.0" encoding="UTF-8" standalone="yes"?><results>\1</results>" dest="4">
<RegExp conditional="!OrigTitleInSrchResult" input="$$1" output="<entity><title>\2</title><year>\4</year><id>\1</id><url cache="daum-movie-\1.json">http://movie.daum.net/data/movie/movie_info/detail.json?movieId=\1</url></entity>" dest="5">
<expression clean="1">movie\.daum\.net[^\?]*\?movieId=(\d*)[^\?]*tit_name"><b>(.[^"]*)</b>[^\?]*tit_sub">(.[^"]*),[^\?]*(\d{4})</expression>
</RegExp>
<RegExp conditional="OrigTitleInSrchResult" input="$$6" output="\1" dest="5">
<RegExp input="$$1" output="<entity><title>\2(\3)</title><year>\4</year><id>\1</id><url cache="daum-movie-\1.json">http://movie.daum.net/data/movie/movie_info/detail.json?movieId=\1</url></entity>" dest="6">
<expression clean="1">movie\.daum\.net[^\?]*\?movieId=(\d*)[^\?]*tit_name"><b>(.[^"]*)</b>[^\?]*tit_sub">(.[^"]*),[^\?]*(\d{4})</expression>
</RegExp>
<expression noclean="1" />
</RegExp>
<expression noclean="1" />
</RegExp>
</GetSearchResults>
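To see what the modified <expression> captures, it can be tried outside Kodi. The following is a minimal Python sketch, not part of the scraper. The regular expression is copied from the modified XML above, but the HTML fragment is hypothetical, invented only to fit the pattern, and the real Daum search page may look different. The function names are also made up for illustration. Note that the modified expression has no repeat="yes" attribute, so inside Kodi it presumably returns at most one result. The sketch iterates over all matches only to show the capture groups on more than one entry.

```python
import re

# The modified <expression>, copied from the scraper XML above.
SEARCH_RESULT = re.compile(
    r'movie\.daum\.net[^\?]*\?movieId=(\d*)[^\?]*tit_name"><b>(.[^"]*)</b>'
    r'[^\?]*tit_sub">(.[^"]*),[^\?]*(\d{4})'
)

# Hypothetical markup, invented for this sketch; not copied from Daum.
SAMPLE_HTML = (
    '<a href="http://movie.daum.net/moviedb/main?movieId=111" class="tit_name">'
    '<b>영화 하나</b></a><span class="tit_sub">Movie One, 2019</span>'
    '<a href="http://movie.daum.net/moviedb/main?movieId=222" class="tit_name">'
    '<b>영화 둘</b></a><span class="tit_sub">Movie Two, 2017</span>'
)


def parse_results(html):
    """Return one dict per match, mirroring the capture groups \\1 to \\4."""
    return [
        {
            "id": m.group(1),          # \1: movieId
            "title": m.group(2),       # \2: text inside tit_name
            "orig_title": m.group(3),  # \3: tit_sub text before the last comma
            "year": m.group(4),        # \4: four-digit year
        }
        for m in SEARCH_RESULT.finditer(html)
    ]


def to_entity(result, with_orig_title=False):
    """Build the <entity> string the way the scraper's output attribute does."""
    title = result["title"]
    if with_orig_title:  # corresponds to the OrigTitleInSrchResult branch
        title = "{}({})".format(title, result["orig_title"])
    return (
        "<entity><title>{title}</title><year>{year}</year><id>{id}</id>"
        '<url cache="daum-movie-{id}.json">'
        "http://movie.daum.net/data/movie/movie_info/detail.json?movieId={id}"
        "</url></entity>"
    ).format(title=title, year=result["year"], id=result["id"])


if __name__ == "__main__":
    for result in parse_results(SAMPLE_HTML):
        print(to_entity(result, with_orig_title=True))
```

Because every `[^\?]*` is greedy and only stops at the next `?`, the year group picks up the last four-digit run before the next movieId link. On real pages that may be something other than the production year, which is one reason the expression is listed for refinement in the TODO.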
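5. TODO
- Restrict the Daum search to movies, or use the suggest feature of the movie page search (as the Plex Daum agent does)
- CreateSearchUrl: extract only the Korean and numeric parts of the title and use them as the search keywords
- GetSearchResults: refine the regular expression (I copied it without really understanding it ^^;;)
- For titles that cannot be found, allow a manual search where entering just the movieId is enough
- Distribute it as a zip
(To avoid confusion, only the latest version is kept; earlier versions have been deleted.)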
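The CreateSearchUrl item in the TODO list could look roughly like the Python sketch below. It is only an illustration of the idea, not scraper code: the function names are hypothetical, "Korean" is assumed to mean precomposed Hangul syllables (U+AC00 to U+D7A3), and the search URL prefix is the one used in the modified <CreateSearchUrl> above. In the actual scraper this filtering would have to be expressed as a RegExp inside the XML.

```python
import re
from urllib.parse import quote

# Search URL prefix taken from the modified <CreateSearchUrl> above.
SEARCH_URL = "http://search.daum.net/search?w=tot&q="

# Runs of Hangul syllables or runs of digits; everything else is dropped.
KEYWORD = re.compile(r"[가-힣]+|\d+")


def extract_keywords(raw_title):
    """Keep only the Korean and numeric parts of a title, joined by spaces."""
    return " ".join(KEYWORD.findall(raw_title))


def build_search_url(raw_title):
    """Percent-encode the filtered keywords and append them to the search URL."""
    return SEARCH_URL + quote(extract_keywords(raw_title))


if __name__ == "__main__":
    print(build_search_url("Parasite 기생충 2019"))
```

A side effect of this simple filter is that digits from tags such as a resolution are kept too, so the input would probably need cleaning before it is used on raw file names.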