mirror of
https://source.netsyms.com/Mirrors/youtube-dl
synced 2026-03-30 15:52:23 +00:00
Compare commits
52 Commits
2016.08.07
...
2016.08.12
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
b0081562d2 | ||
|
|
fff37cfd4f | ||
|
|
a3be69b7f0 | ||
|
|
0fd1b1624c | ||
|
|
367976d49f | ||
|
|
0aef0771f8 | ||
|
|
0c070681c5 | ||
|
|
30b25d382d | ||
|
|
e5f878c205 | ||
|
|
e2e84aed7e | ||
|
|
b1927f4e8a | ||
|
|
3b9323d96e | ||
|
|
7f832413d6 | ||
|
|
7f2ed47595 | ||
|
|
c3fa77bdef | ||
|
|
57ce8a6d08 | ||
|
|
69d8eeeec5 | ||
|
|
81c13222c6 | ||
|
|
b1ce2ba197 | ||
|
|
5c8411e968 | ||
|
|
cc9c8ce5df | ||
|
|
20ef4123b9 | ||
|
|
4e62d26aa2 | ||
|
|
b657816684 | ||
|
|
9778b3e7ee | ||
|
|
25dd58ca6a | ||
|
|
5e42f8a0ad | ||
|
|
1ad6b891b2 | ||
|
|
7aa589a5e1 | ||
|
|
065bc35489 | ||
|
|
3a380766d1 | ||
|
|
affaea0688 | ||
|
|
77426a087b | ||
|
|
8991844ea2 | ||
|
|
082395d0a0 | ||
|
|
e8ed7354e6 | ||
|
|
1e7f602e2a | ||
|
|
522f6c066d | ||
|
|
321b5e082a | ||
|
|
3711fa1eb2 | ||
|
|
395c74615c | ||
|
|
3dc240e8c6 | ||
|
|
a41a6c5094 | ||
|
|
d71207121d | ||
|
|
b1c6f21c74 | ||
|
|
412abb8760 | ||
|
|
f17d5f6d14 | ||
|
|
6bb801cfaf | ||
|
|
de02d1f4e9 | ||
|
|
e1f93a0a76 | ||
|
|
d21a661bb4 | ||
|
|
b2bd968f4b |
6
.github/ISSUE_TEMPLATE.md
vendored
6
.github/ISSUE_TEMPLATE.md
vendored
@@ -6,8 +6,8 @@
|
||||
|
||||
---
|
||||
|
||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.08.07*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.08.07**
|
||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.08.12*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.08.12**
|
||||
|
||||
### Before submitting an *issue* make sure you have:
|
||||
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
|
||||
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
|
||||
[debug] User config: []
|
||||
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
|
||||
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
|
||||
[debug] youtube-dl version 2016.08.07
|
||||
[debug] youtube-dl version 2016.08.12
|
||||
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
|
||||
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
|
||||
[debug] Proxy map: {}
|
||||
|
||||
2
AUTHORS
2
AUTHORS
@@ -179,3 +179,5 @@ Jakub Adam Wieczorek
|
||||
Aleksandar Topuzović
|
||||
Nehal Patel
|
||||
Rob van Bekkum
|
||||
Petr Zvoníček
|
||||
Pratyush Singh
|
||||
|
||||
@@ -46,7 +46,7 @@ Make sure that someone has not already opened the issue you're trying to open. S
|
||||
|
||||
### Why are existing options not enough?
|
||||
|
||||
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#synopsis). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
|
||||
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
|
||||
|
||||
### Is there enough context in your bug report?
|
||||
|
||||
|
||||
36
ChangeLog
36
ChangeLog
@@ -1,3 +1,39 @@
|
||||
version 2016.08.12
|
||||
|
||||
Core
|
||||
* Subtitles are now written as is. Newline conversions are disabled. (#10268)
|
||||
+ Recognize more formats in unified_timestamp
|
||||
|
||||
Extractors
|
||||
- [goldenmoustache] Remove extractor (#10298)
|
||||
* [drtuber] Improve title extraction
|
||||
* [drtuber] Make dislike count optional (#10297)
|
||||
* [chirbit] Fix extraction (#10296)
|
||||
* [francetvinfo] Relax URL regular expression
|
||||
* [rtlnl] Relax URL regular expression (#10282)
|
||||
* [formula1] Relax URL regular expression (#10283)
|
||||
* [wat] Improve extraction (#10281)
|
||||
* [ctsnews] Fix extraction
|
||||
|
||||
|
||||
version 2016.08.10
|
||||
|
||||
Core
|
||||
* Make --metadata-from-title non fatal when title does not match the pattern
|
||||
* Introduce options for randomized sleep before each download
|
||||
--min-sleep-interval and --max-sleep-interval (#9930)
|
||||
* Respect default in _search_json_ld
|
||||
|
||||
Extractors
|
||||
+ [uol] Add extractor for uol.com.br (#4263)
|
||||
* [rbmaradio] Fix extraction and extract all formats (#10242)
|
||||
+ [sonyliv] Add extractor for sonyliv.com (#10258)
|
||||
* [aparat] Fix extraction
|
||||
* [cwtv] Extract HTTP formats
|
||||
+ [rozhlas] Add extractor for prehravac.rozhlas.cz (#10253)
|
||||
* [kuwo:singer] Fix extraction
|
||||
|
||||
|
||||
version 2016.08.07
|
||||
|
||||
Core
|
||||
|
||||
12
README.md
12
README.md
@@ -330,7 +330,15 @@ which means you can modify it, redistribute it or use it however you like.
|
||||
bidirectional text support. Requires bidiv
|
||||
or fribidi executable in PATH
|
||||
--sleep-interval SECONDS Number of seconds to sleep before each
|
||||
download.
|
||||
download when used alone or a lower bound
|
||||
of a range for randomized sleep before each
|
||||
download (minimum possible number of
|
||||
seconds to sleep) when used along with
|
||||
--max-sleep-interval.
|
||||
--max-sleep-interval SECONDS Upper bound of a range for randomized sleep
|
||||
before each download (maximum possible
|
||||
number of seconds to sleep). Must only be
|
||||
used along with --min-sleep-interval.
|
||||
|
||||
## Video Format Options:
|
||||
-f, --format FORMAT Video format code, see the "FORMAT
|
||||
@@ -1196,7 +1204,7 @@ Make sure that someone has not already opened the issue you're trying to open. S
|
||||
|
||||
### Why are existing options not enough?
|
||||
|
||||
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#synopsis). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
|
||||
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
|
||||
|
||||
### Is there enough context in your bug report?
|
||||
|
||||
|
||||
@@ -54,17 +54,21 @@ def filter_options(readme):
|
||||
|
||||
if in_options:
|
||||
if line.lstrip().startswith('-'):
|
||||
option, description = re.split(r'\s{2,}', line.lstrip())
|
||||
split_option = option.split(' ')
|
||||
split = re.split(r'\s{2,}', line.lstrip())
|
||||
# Description string may start with `-` as well. If there is
|
||||
# only one piece then it's a description bit not an option.
|
||||
if len(split) > 1:
|
||||
option, description = split
|
||||
split_option = option.split(' ')
|
||||
|
||||
if not split_option[-1].startswith('-'): # metavar
|
||||
option = ' '.join(split_option[:-1] + ['*%s*' % split_option[-1]])
|
||||
if not split_option[-1].startswith('-'): # metavar
|
||||
option = ' '.join(split_option[:-1] + ['*%s*' % split_option[-1]])
|
||||
|
||||
# Pandoc's definition_lists. See http://pandoc.org/README.html
|
||||
# for more information.
|
||||
ret += '\n%s\n: %s\n' % (option, description)
|
||||
else:
|
||||
ret += line.lstrip() + '\n'
|
||||
# Pandoc's definition_lists. See http://pandoc.org/README.html
|
||||
# for more information.
|
||||
ret += '\n%s\n: %s\n' % (option, description)
|
||||
continue
|
||||
ret += line.lstrip() + '\n'
|
||||
else:
|
||||
ret += line + '\n'
|
||||
|
||||
|
||||
@@ -265,7 +265,6 @@
|
||||
- **GloboArticle**
|
||||
- **GodTube**
|
||||
- **GodTV**
|
||||
- **GoldenMoustache**
|
||||
- **Golem**
|
||||
- **GoogleDrive**
|
||||
- **Goshgay**
|
||||
@@ -564,6 +563,7 @@
|
||||
- **RoosterTeeth**
|
||||
- **RottenTomatoes**
|
||||
- **Roxwel**
|
||||
- **Rozhlas**
|
||||
- **RTBF**
|
||||
- **rte**: Raidió Teilifís Éireann TV
|
||||
- **rte:radio**: Raidió Teilifís Éireann radio
|
||||
@@ -621,6 +621,7 @@
|
||||
- **smotri:user**: Smotri.com user videos
|
||||
- **Snotr**
|
||||
- **Sohu**
|
||||
- **SonyLIV**
|
||||
- **soundcloud**
|
||||
- **soundcloud:playlist**
|
||||
- **soundcloud:search**: Soundcloud search
|
||||
@@ -747,6 +748,7 @@
|
||||
- **udemy:course**
|
||||
- **UDNEmbed**: 聯合影音
|
||||
- **Unistra**
|
||||
- **uol.com.br**
|
||||
- **Urort**: NRK P3 Urørt
|
||||
- **URPlay**
|
||||
- **USAToday**
|
||||
|
||||
@@ -249,7 +249,16 @@ class YoutubeDL(object):
|
||||
source_address: (Experimental) Client-side IP address to bind to.
|
||||
call_home: Boolean, true iff we are allowed to contact the
|
||||
youtube-dl servers for debugging.
|
||||
sleep_interval: Number of seconds to sleep before each download.
|
||||
sleep_interval: Number of seconds to sleep before each download when
|
||||
used alone or a lower bound of a range for randomized
|
||||
sleep before each download (minimum possible number
|
||||
of seconds to sleep) when used along with
|
||||
max_sleep_interval.
|
||||
max_sleep_interval:Upper bound of a range for randomized sleep before each
|
||||
download (maximum possible number of seconds to sleep).
|
||||
Must only be used along with sleep_interval.
|
||||
Actual sleep time will be a random float from range
|
||||
[sleep_interval; max_sleep_interval].
|
||||
listformats: Print an overview of available video formats and exit.
|
||||
list_thumbnails: Print a table of all thumbnails and exit.
|
||||
match_filter: A function that gets called with the info_dict of
|
||||
@@ -1594,7 +1603,9 @@ class YoutubeDL(object):
|
||||
self.to_screen('[info] Video subtitle %s.%s is already_present' % (sub_lang, sub_format))
|
||||
else:
|
||||
self.to_screen('[info] Writing video subtitles to: ' + sub_filename)
|
||||
with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8') as subfile:
|
||||
# Use newline='' to prevent conversion of newline characters
|
||||
# See https://github.com/rg3/youtube-dl/issues/10268
|
||||
with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8', newline='') as subfile:
|
||||
subfile.write(sub_data)
|
||||
except (OSError, IOError):
|
||||
self.report_error('Cannot write subtitles file ' + sub_filename)
|
||||
|
||||
@@ -145,6 +145,16 @@ def _real_main(argv=None):
|
||||
if numeric_limit is None:
|
||||
parser.error('invalid max_filesize specified')
|
||||
opts.max_filesize = numeric_limit
|
||||
if opts.sleep_interval is not None:
|
||||
if opts.sleep_interval < 0:
|
||||
parser.error('sleep interval must be positive or 0')
|
||||
if opts.max_sleep_interval is not None:
|
||||
if opts.max_sleep_interval < 0:
|
||||
parser.error('max sleep interval must be positive or 0')
|
||||
if opts.max_sleep_interval < opts.sleep_interval:
|
||||
parser.error('max sleep interval must be greater than or equal to min sleep interval')
|
||||
else:
|
||||
opts.max_sleep_interval = opts.sleep_interval
|
||||
|
||||
def parse_retries(retries):
|
||||
if retries in ('inf', 'infinite'):
|
||||
@@ -370,6 +380,7 @@ def _real_main(argv=None):
|
||||
'source_address': opts.source_address,
|
||||
'call_home': opts.call_home,
|
||||
'sleep_interval': opts.sleep_interval,
|
||||
'max_sleep_interval': opts.max_sleep_interval,
|
||||
'external_downloader': opts.external_downloader,
|
||||
'list_thumbnails': opts.list_thumbnails,
|
||||
'playlist_items': opts.playlist_items,
|
||||
|
||||
@@ -4,6 +4,7 @@ import os
|
||||
import re
|
||||
import sys
|
||||
import time
|
||||
import random
|
||||
|
||||
from ..compat import compat_os_name
|
||||
from ..utils import (
|
||||
@@ -342,8 +343,11 @@ class FileDownloader(object):
|
||||
})
|
||||
return True
|
||||
|
||||
sleep_interval = self.params.get('sleep_interval')
|
||||
if sleep_interval:
|
||||
min_sleep_interval = self.params.get('sleep_interval')
|
||||
if min_sleep_interval:
|
||||
max_sleep_interval = self.params.get('max_sleep_interval', min_sleep_interval)
|
||||
print(min_sleep_interval, max_sleep_interval)
|
||||
sleep_interval = random.uniform(min_sleep_interval, max_sleep_interval)
|
||||
self.to_screen('[download] Sleeping %s seconds...' % sleep_interval)
|
||||
time.sleep(sleep_interval)
|
||||
|
||||
|
||||
@@ -123,6 +123,10 @@ class AolFeaturesIE(InfoExtractor):
|
||||
'title': 'What To Watch - February 17, 2016',
|
||||
},
|
||||
'add_ie': ['FiveMin'],
|
||||
'params': {
|
||||
# encrypted m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
||||
@@ -1,8 +1,6 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
@@ -15,7 +13,7 @@ class AparatIE(InfoExtractor):
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.aparat.com/v/wP8On',
|
||||
'md5': '6714e0af7e0d875c5a39c4dc4ab46ad1',
|
||||
'md5': '131aca2e14fe7c4dcb3c4877ba300c89',
|
||||
'info_dict': {
|
||||
'id': 'wP8On',
|
||||
'ext': 'mp4',
|
||||
@@ -31,13 +29,13 @@ class AparatIE(InfoExtractor):
|
||||
# Note: There is an easier-to-parse configuration at
|
||||
# http://www.aparat.com/video/video/config/videohash/%video_id
|
||||
# but the URL in there does not work
|
||||
embed_url = ('http://www.aparat.com/video/video/embed/videohash/' +
|
||||
video_id + '/vt/frame')
|
||||
embed_url = 'http://www.aparat.com/video/video/embed/vt/frame/showvideo/yes/videohash/' + video_id
|
||||
webpage = self._download_webpage(embed_url, video_id)
|
||||
|
||||
video_urls = [video_url.replace('\\/', '/') for video_url in re.findall(
|
||||
r'(?:fileList\[[0-9]+\]\s*=|"file"\s*:)\s*"([^"]+)"', webpage)]
|
||||
for i, video_url in enumerate(video_urls):
|
||||
file_list = self._parse_json(self._search_regex(
|
||||
r'fileList\s*=\s*JSON\.parse\(\'([^\']+)\'\)', webpage, 'file list'), video_id)
|
||||
for i, item in enumerate(file_list[0]):
|
||||
video_url = item['file']
|
||||
req = HEADRequest(video_url)
|
||||
res = self._request_webpage(
|
||||
req, video_id, note='Testing video URL %d' % i, errnote=False)
|
||||
|
||||
@@ -759,7 +759,7 @@ class BBCIE(BBCCoUkIE):
|
||||
|
||||
webpage = self._download_webpage(url, playlist_id)
|
||||
|
||||
json_ld_info = self._search_json_ld(webpage, playlist_id, default=None)
|
||||
json_ld_info = self._search_json_ld(webpage, playlist_id, default={})
|
||||
timestamp = json_ld_info.get('timestamp')
|
||||
|
||||
playlist_title = json_ld_info.get('title')
|
||||
|
||||
@@ -25,13 +25,13 @@ class BiliBiliIE(InfoExtractor):
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.bilibili.tv/video/av1074402/',
|
||||
'md5': '5f7d29e1a2872f3df0cf76b1f87d3788',
|
||||
'md5': '9fa226fe2b8a9a4d5a69b4c6a183417e',
|
||||
'info_dict': {
|
||||
'id': '1554319',
|
||||
'ext': 'flv',
|
||||
'ext': 'mp4',
|
||||
'title': '【金坷垃】金泡沫',
|
||||
'description': 'md5:ce18c2a2d2193f0df2917d270f2e5923',
|
||||
'duration': 308.067,
|
||||
'duration': 308.315,
|
||||
'timestamp': 1398012660,
|
||||
'upload_date': '20140420',
|
||||
'thumbnail': 're:^https?://.+\.jpg',
|
||||
@@ -41,73 +41,33 @@ class BiliBiliIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'http://www.bilibili.com/video/av1041170/',
|
||||
'info_dict': {
|
||||
'id': '1041170',
|
||||
'id': '1507019',
|
||||
'ext': 'mp4',
|
||||
'title': '【BD1080P】刀语【诸神&异域】',
|
||||
'description': '这是个神奇的故事~每个人不留弹幕不给走哦~切利哦!~',
|
||||
'timestamp': 1396530060,
|
||||
'upload_date': '20140403',
|
||||
'uploader': '枫叶逝去',
|
||||
'uploader_id': '520116',
|
||||
},
|
||||
'playlist_count': 9,
|
||||
}, {
|
||||
'url': 'http://www.bilibili.com/video/av4808130/',
|
||||
'info_dict': {
|
||||
'id': '4808130',
|
||||
'id': '7802182',
|
||||
'ext': 'mp4',
|
||||
'title': '【长篇】哆啦A梦443【钉铛】',
|
||||
'description': '(2016.05.27)来组合客人的脸吧&amp;寻母六千里锭 抱歉,又轮到周日上班现在才到家 封面www.pixiv.net/member_illust.php?mode=medium&amp;illust_id=56912929',
|
||||
'timestamp': 1464564180,
|
||||
'upload_date': '20160529',
|
||||
'uploader': '喜欢拉面',
|
||||
'uploader_id': '151066',
|
||||
},
|
||||
'playlist': [{
|
||||
'md5': '55cdadedf3254caaa0d5d27cf20a8f9c',
|
||||
'info_dict': {
|
||||
'id': '4808130_part1',
|
||||
'ext': 'flv',
|
||||
'title': '【长篇】哆啦A梦443【钉铛】',
|
||||
'description': '(2016.05.27)来组合客人的脸吧&amp;寻母六千里锭 抱歉,又轮到周日上班现在才到家 封面www.pixiv.net/member_illust.php?mode=medium&amp;illust_id=56912929',
|
||||
'timestamp': 1464564180,
|
||||
'upload_date': '20160529',
|
||||
'uploader': '喜欢拉面',
|
||||
'uploader_id': '151066',
|
||||
},
|
||||
}, {
|
||||
'md5': '926f9f67d0c482091872fbd8eca7ea3d',
|
||||
'info_dict': {
|
||||
'id': '4808130_part2',
|
||||
'ext': 'flv',
|
||||
'title': '【长篇】哆啦A梦443【钉铛】',
|
||||
'description': '(2016.05.27)来组合客人的脸吧&amp;寻母六千里锭 抱歉,又轮到周日上班现在才到家 封面www.pixiv.net/member_illust.php?mode=medium&amp;illust_id=56912929',
|
||||
'timestamp': 1464564180,
|
||||
'upload_date': '20160529',
|
||||
'uploader': '喜欢拉面',
|
||||
'uploader_id': '151066',
|
||||
},
|
||||
}, {
|
||||
'md5': '4b7b225b968402d7c32348c646f1fd83',
|
||||
'info_dict': {
|
||||
'id': '4808130_part3',
|
||||
'ext': 'flv',
|
||||
'title': '【长篇】哆啦A梦443【钉铛】',
|
||||
'description': '(2016.05.27)来组合客人的脸吧&amp;寻母六千里锭 抱歉,又轮到周日上班现在才到家 封面www.pixiv.net/member_illust.php?mode=medium&amp;illust_id=56912929',
|
||||
'timestamp': 1464564180,
|
||||
'upload_date': '20160529',
|
||||
'uploader': '喜欢拉面',
|
||||
'uploader_id': '151066',
|
||||
},
|
||||
}, {
|
||||
'md5': '7b795e214166501e9141139eea236e91',
|
||||
'info_dict': {
|
||||
'id': '4808130_part4',
|
||||
'ext': 'flv',
|
||||
'title': '【长篇】哆啦A梦443【钉铛】',
|
||||
'description': '(2016.05.27)来组合客人的脸吧&amp;寻母六千里锭 抱歉,又轮到周日上班现在才到家 封面www.pixiv.net/member_illust.php?mode=medium&amp;illust_id=56912929',
|
||||
'timestamp': 1464564180,
|
||||
'upload_date': '20160529',
|
||||
'uploader': '喜欢拉面',
|
||||
'uploader_id': '151066',
|
||||
},
|
||||
}],
|
||||
}, {
|
||||
# Missing upload time
|
||||
'url': 'http://www.bilibili.com/video/av1867637/',
|
||||
'info_dict': {
|
||||
'id': '2880301',
|
||||
'ext': 'flv',
|
||||
'ext': 'mp4',
|
||||
'title': '【HDTV】【喜剧】岳父岳母真难当 (2014)【法国票房冠军】',
|
||||
'description': '一个信奉天主教的法国旧式传统资产阶级家庭中有四个女儿。三个女儿却分别找了阿拉伯、犹太、中国丈夫,老夫老妻唯独期盼剩下未嫁的小女儿能找一个信奉天主教的法国白人,结果没想到小女儿找了一位非裔黑人……【这次应该不会跳帧了】',
|
||||
'uploader': '黑夜为猫',
|
||||
|
||||
@@ -24,7 +24,8 @@ class BIQLEIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Ребенок в шоке от автоматической мойки',
|
||||
'uploader': 'Dmitry Kotov',
|
||||
}
|
||||
},
|
||||
'skip': ' This video was marked as adult. Embedding adult videos on external sites is prohibited.',
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
||||
@@ -17,7 +17,8 @@ class ChaturbateIE(InfoExtractor):
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
}
|
||||
},
|
||||
'skip': 'Room is offline',
|
||||
}, {
|
||||
'url': 'https://en.chaturbate.com/siswet19/',
|
||||
'only_matching': True,
|
||||
|
||||
@@ -1,30 +1,33 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
parse_duration,
|
||||
int_or_none,
|
||||
)
|
||||
from ..utils import parse_duration
|
||||
|
||||
|
||||
class ChirbitIE(InfoExtractor):
|
||||
IE_NAME = 'chirbit'
|
||||
_VALID_URL = r'https?://(?:www\.)?chirb\.it/(?:(?:wp|pl)/|fb_chirbit_player\.swf\?key=)?(?P<id>[\da-zA-Z]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://chirb.it/PrIPv5',
|
||||
'md5': '9847b0dad6ac3e074568bf2cfb197de8',
|
||||
'url': 'http://chirb.it/be2abG',
|
||||
'info_dict': {
|
||||
'id': 'PrIPv5',
|
||||
'id': 'be2abG',
|
||||
'ext': 'mp3',
|
||||
'title': 'Фасадстрой',
|
||||
'duration': 52,
|
||||
'view_count': int,
|
||||
'comment_count': int,
|
||||
'title': 'md5:f542ea253f5255240be4da375c6a5d7e',
|
||||
'description': 'md5:f24a4e22a71763e32da5fed59e47c770',
|
||||
'duration': 306,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
}
|
||||
}, {
|
||||
'url': 'https://chirb.it/fb_chirbit_player.swf?key=PrIPv5',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://chirb.it/wp/MN58c2',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
@@ -33,27 +36,30 @@ class ChirbitIE(InfoExtractor):
|
||||
webpage = self._download_webpage(
|
||||
'http://chirb.it/%s' % audio_id, audio_id)
|
||||
|
||||
audio_url = self._search_regex(
|
||||
r'"setFile"\s*,\s*"([^"]+)"', webpage, 'audio url')
|
||||
data_fd = self._search_regex(
|
||||
r'data-fd=(["\'])(?P<url>(?:(?!\1).)+)\1',
|
||||
webpage, 'data fd', group='url')
|
||||
|
||||
# Reverse engineered from https://chirb.it/js/chirbit.player.js (look
|
||||
# for soundURL)
|
||||
audio_url = base64.b64decode(
|
||||
data_fd[::-1].encode('ascii')).decode('utf-8')
|
||||
|
||||
title = self._search_regex(
|
||||
r'itemprop="name">([^<]+)', webpage, 'title')
|
||||
duration = parse_duration(self._html_search_meta(
|
||||
'duration', webpage, 'duration', fatal=False))
|
||||
view_count = int_or_none(self._search_regex(
|
||||
r'itemprop="playCount"\s*>(\d+)', webpage,
|
||||
'listen count', fatal=False))
|
||||
comment_count = int_or_none(self._search_regex(
|
||||
r'>(\d+) Comments?:', webpage,
|
||||
'comment count', fatal=False))
|
||||
r'class=["\']chirbit-title["\'][^>]*>([^<]+)', webpage, 'title')
|
||||
description = self._search_regex(
|
||||
r'<h3>Description</h3>\s*<pre[^>]*>([^<]+)</pre>',
|
||||
webpage, 'description', default=None)
|
||||
duration = parse_duration(self._search_regex(
|
||||
r'class=["\']c-length["\'][^>]*>([^<]+)',
|
||||
webpage, 'duration', fatal=False))
|
||||
|
||||
return {
|
||||
'id': audio_id,
|
||||
'url': audio_url,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'duration': duration,
|
||||
'view_count': view_count,
|
||||
'comment_count': comment_count,
|
||||
}
|
||||
|
||||
|
||||
|
||||
@@ -816,11 +816,14 @@ class InfoExtractor(object):
|
||||
json_ld = self._search_regex(
|
||||
r'(?s)<script[^>]+type=(["\'])application/ld\+json\1[^>]*>(?P<json_ld>.+?)</script>',
|
||||
html, 'JSON-LD', group='json_ld', **kwargs)
|
||||
default = kwargs.get('default', NO_DEFAULT)
|
||||
if not json_ld:
|
||||
return {}
|
||||
return self._json_ld(
|
||||
json_ld, video_id, fatal=kwargs.get('fatal', True),
|
||||
expected_type=expected_type)
|
||||
return default if default is not NO_DEFAULT else {}
|
||||
# JSON-LD may be malformed and thus `fatal` should be respected.
|
||||
# At the same time `default` may be passed that assumes `fatal=False`
|
||||
# for _search_regex. Let's simulate the same behavior here as well.
|
||||
fatal = kwargs.get('fatal', True) if default == NO_DEFAULT else False
|
||||
return self._json_ld(json_ld, video_id, fatal=fatal, expected_type=expected_type)
|
||||
|
||||
def _json_ld(self, json_ld, video_id, fatal=True, expected_type=None):
|
||||
if isinstance(json_ld, compat_str):
|
||||
|
||||
@@ -143,7 +143,8 @@ class CondeNastIE(InfoExtractor):
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
info = self._search_json_ld(webpage, video_id) if url_type != 'embed' else {}
|
||||
info = self._search_json_ld(
|
||||
webpage, video_id, fatal=False) if url_type != 'embed' else {}
|
||||
info.update({
|
||||
'id': video_id,
|
||||
'formats': formats,
|
||||
|
||||
@@ -1,13 +1,12 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import parse_iso8601, ExtractorError
|
||||
from ..utils import unified_timestamp
|
||||
|
||||
|
||||
class CtsNewsIE(InfoExtractor):
|
||||
IE_DESC = '華視新聞'
|
||||
# https connection failed (Connection reset)
|
||||
_VALID_URL = r'https?://news\.cts\.com\.tw/[a-z]+/[a-z]+/\d+/(?P<id>\d+)\.html'
|
||||
_TESTS = [{
|
||||
'url': 'http://news.cts.com.tw/cts/international/201501/201501291578109.html',
|
||||
@@ -16,7 +15,7 @@ class CtsNewsIE(InfoExtractor):
|
||||
'id': '201501291578109',
|
||||
'ext': 'mp4',
|
||||
'title': '以色列.真主黨交火 3人死亡',
|
||||
'description': 'md5:95e9b295c898b7ff294f09d450178d7d',
|
||||
'description': '以色列和黎巴嫩真主黨,爆發五年最嚴重衝突,雙方砲轟交火,兩名以軍死亡,還有一名西班牙籍的聯合國維和人...',
|
||||
'timestamp': 1422528540,
|
||||
'upload_date': '20150129',
|
||||
}
|
||||
@@ -28,7 +27,7 @@ class CtsNewsIE(InfoExtractor):
|
||||
'id': '201309031304098',
|
||||
'ext': 'mp4',
|
||||
'title': '韓國31歲童顏男 貌如十多歲小孩',
|
||||
'description': 'md5:f183feeba3752b683827aab71adad584',
|
||||
'description': '越有年紀的人,越希望看起來年輕一點,而南韓卻有一位31歲的男子,看起來像是11、12歲的小孩,身...',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'timestamp': 1378205880,
|
||||
'upload_date': '20130903',
|
||||
@@ -36,8 +35,7 @@ class CtsNewsIE(InfoExtractor):
|
||||
}, {
|
||||
# With Youtube embedded video
|
||||
'url': 'http://news.cts.com.tw/cts/money/201501/201501291578003.html',
|
||||
'md5': '1d842c771dc94c8c3bca5af2cc1db9c5',
|
||||
'add_ie': ['Youtube'],
|
||||
'md5': 'e4726b2ccd70ba2c319865e28f0a91d1',
|
||||
'info_dict': {
|
||||
'id': 'OVbfO7d0_hQ',
|
||||
'ext': 'mp4',
|
||||
@@ -47,42 +45,37 @@ class CtsNewsIE(InfoExtractor):
|
||||
'upload_date': '20150128',
|
||||
'uploader_id': 'TBSCTS',
|
||||
'uploader': '中華電視公司',
|
||||
}
|
||||
},
|
||||
'add_ie': ['Youtube'],
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
news_id = self._match_id(url)
|
||||
page = self._download_webpage(url, news_id)
|
||||
|
||||
if self._search_regex(r'(CTSPlayer2)', page, 'CTSPlayer2 identifier', default=None):
|
||||
feed_url = self._html_search_regex(
|
||||
r'(http://news\.cts\.com\.tw/action/mp4feed\.php\?news_id=\d+)',
|
||||
page, 'feed url')
|
||||
video_url = self._download_webpage(
|
||||
feed_url, news_id, note='Fetching feed')
|
||||
news_id = self._hidden_inputs(page).get('get_id')
|
||||
|
||||
if news_id:
|
||||
mp4_feed = self._download_json(
|
||||
'http://news.cts.com.tw/action/test_mp4feed.php',
|
||||
news_id, note='Fetching feed', query={'news_id': news_id})
|
||||
video_url = mp4_feed['source_url']
|
||||
else:
|
||||
self.to_screen('Not CTSPlayer video, trying Youtube...')
|
||||
youtube_url = self._search_regex(
|
||||
r'src="(//www\.youtube\.com/embed/[^"]+)"', page, 'youtube url',
|
||||
default=None)
|
||||
if not youtube_url:
|
||||
raise ExtractorError('The news includes no videos!', expected=True)
|
||||
r'src="(//www\.youtube\.com/embed/[^"]+)"', page, 'youtube url')
|
||||
|
||||
return {
|
||||
'_type': 'url',
|
||||
'url': youtube_url,
|
||||
'ie_key': 'Youtube',
|
||||
}
|
||||
return self.url_result(youtube_url, ie='Youtube')
|
||||
|
||||
description = self._html_search_meta('description', page)
|
||||
title = self._html_search_meta('title', page)
|
||||
title = self._html_search_meta('title', page, fatal=True)
|
||||
thumbnail = self._html_search_meta('image', page)
|
||||
|
||||
datetime_str = self._html_search_regex(
|
||||
r'(\d{4}/\d{2}/\d{2} \d{2}:\d{2})', page, 'date and time')
|
||||
# Transform into ISO 8601 format with timezone info
|
||||
datetime_str = datetime_str.replace('/', '-') + ':00+0800'
|
||||
timestamp = parse_iso8601(datetime_str, delimiter=' ')
|
||||
r'(\d{4}/\d{2}/\d{2} \d{2}:\d{2})', page, 'date and time', fatal=False)
|
||||
timestamp = None
|
||||
if datetime_str:
|
||||
timestamp = unified_timestamp(datetime_str) - 8 * 3600
|
||||
|
||||
return {
|
||||
'id': news_id,
|
||||
|
||||
@@ -28,7 +28,8 @@ class CWTVIE(InfoExtractor):
|
||||
'params': {
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
}
|
||||
},
|
||||
'skip': 'redirect to http://cwtv.com/shows/arrow/',
|
||||
}, {
|
||||
'url': 'http://www.cwseed.com/shows/whose-line-is-it-anyway/jeff-davis-4/?play=24282b12-ead2-42f2-95ad-26770c2c6088',
|
||||
'info_dict': {
|
||||
@@ -44,10 +45,6 @@ class CWTVIE(InfoExtractor):
|
||||
'upload_date': '20151006',
|
||||
'timestamp': 1444107300,
|
||||
},
|
||||
'params': {
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
}
|
||||
}, {
|
||||
'url': 'http://cwtv.com/thecw/chroniclesofcisco/?play=8adebe35-f447-465f-ab52-e863506ff6d6',
|
||||
'only_matching': True,
|
||||
@@ -61,11 +58,30 @@ class CWTVIE(InfoExtractor):
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
video_data = self._download_json(
|
||||
'http://metaframe.digitalsmiths.tv/v2/CWtv/assets/%s/partner/132?format=json' % video_id, video_id)
|
||||
|
||||
formats = self._extract_m3u8_formats(
|
||||
video_data['videos']['variantplaylist']['uri'], video_id, 'mp4')
|
||||
video_data = None
|
||||
formats = []
|
||||
for partner in (154, 213):
|
||||
vdata = self._download_json(
|
||||
'http://metaframe.digitalsmiths.tv/v2/CWtv/assets/%s/partner/%d?format=json' % (video_id, partner), video_id, fatal=False)
|
||||
if not vdata:
|
||||
continue
|
||||
video_data = vdata
|
||||
for quality, quality_data in vdata.get('videos', {}).items():
|
||||
quality_url = quality_data.get('uri')
|
||||
if not quality_url:
|
||||
continue
|
||||
if quality == 'variantplaylist':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
quality_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
|
||||
else:
|
||||
tbr = int_or_none(quality_data.get('bitrate'))
|
||||
format_id = 'http' + ('-%d' % tbr if tbr else '')
|
||||
if self._is_valid_url(quality_url, video_id, format_id):
|
||||
formats.append({
|
||||
'format_id': format_id,
|
||||
'url': quality_url,
|
||||
'tbr': tbr,
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
thumbnails = [{
|
||||
|
||||
@@ -3,7 +3,10 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import str_to_int
|
||||
from ..utils import (
|
||||
NO_DEFAULT,
|
||||
str_to_int,
|
||||
)
|
||||
|
||||
|
||||
class DrTuberIE(InfoExtractor):
|
||||
@@ -17,7 +20,6 @@ class DrTuberIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'hot perky blonde naked golf',
|
||||
'like_count': int,
|
||||
'dislike_count': int,
|
||||
'comment_count': int,
|
||||
'categories': ['Babe', 'Blonde', 'Erotic', 'Outdoor', 'Softcore', 'Solo'],
|
||||
'thumbnail': 're:https?://.*\.jpg$',
|
||||
@@ -36,25 +38,29 @@ class DrTuberIE(InfoExtractor):
|
||||
r'<source src="([^"]+)"', webpage, 'video URL')
|
||||
|
||||
title = self._html_search_regex(
|
||||
[r'<p[^>]+class="title_substrate">([^<]+)</p>', r'<title>([^<]+) - \d+'],
|
||||
(r'class="title_watch"[^>]*><p>([^<]+)<',
|
||||
r'<p[^>]+class="title_substrate">([^<]+)</p>',
|
||||
r'<title>([^<]+) - \d+'),
|
||||
webpage, 'title')
|
||||
|
||||
thumbnail = self._html_search_regex(
|
||||
r'poster="([^"]+)"',
|
||||
webpage, 'thumbnail', fatal=False)
|
||||
|
||||
def extract_count(id_, name):
|
||||
def extract_count(id_, name, default=NO_DEFAULT):
|
||||
return str_to_int(self._html_search_regex(
|
||||
r'<span[^>]+(?:class|id)="%s"[^>]*>([\d,\.]+)</span>' % id_,
|
||||
webpage, '%s count' % name, fatal=False))
|
||||
webpage, '%s count' % name, default=default, fatal=False))
|
||||
|
||||
like_count = extract_count('rate_likes', 'like')
|
||||
dislike_count = extract_count('rate_dislikes', 'dislike')
|
||||
dislike_count = extract_count('rate_dislikes', 'dislike', default=None)
|
||||
comment_count = extract_count('comments_count', 'comment')
|
||||
|
||||
cats_str = self._search_regex(
|
||||
r'<div[^>]+class="categories_list">(.+?)</div>', webpage, 'categories', fatal=False)
|
||||
categories = [] if not cats_str else re.findall(r'<a title="([^"]+)"', cats_str)
|
||||
r'<div[^>]+class="categories_list">(.+?)</div>',
|
||||
webpage, 'categories', fatal=False)
|
||||
categories = [] if not cats_str else re.findall(
|
||||
r'<a title="([^"]+)"', cats_str)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
|
||||
@@ -311,7 +311,6 @@ from .globo import (
|
||||
)
|
||||
from .godtube import GodTubeIE
|
||||
from .godtv import GodTVIE
|
||||
from .goldenmoustache import GoldenMoustacheIE
|
||||
from .golem import GolemIE
|
||||
from .googledrive import GoogleDriveIE
|
||||
from .googleplus import GooglePlusIE
|
||||
@@ -696,6 +695,7 @@ from .rockstargames import RockstarGamesIE
|
||||
from .roosterteeth import RoosterTeethIE
|
||||
from .rottentomatoes import RottenTomatoesIE
|
||||
from .roxwel import RoxwelIE
|
||||
from .rozhlas import RozhlasIE
|
||||
from .rtbf import RTBFIE
|
||||
from .rte import RteIE, RteRadioIE
|
||||
from .rtlnl import RtlNlIE
|
||||
@@ -755,6 +755,7 @@ from .smotri import (
|
||||
)
|
||||
from .snotr import SnotrIE
|
||||
from .sohu import SohuIE
|
||||
from .sonyliv import SonyLIVIE
|
||||
from .soundcloud import (
|
||||
SoundcloudIE,
|
||||
SoundcloudSetIE,
|
||||
@@ -927,6 +928,7 @@ from .udemy import (
|
||||
from .udn import UDNEmbedIE
|
||||
from .digiteka import DigitekaIE
|
||||
from .unistra import UnistraIE
|
||||
from .uol import UOLIE
|
||||
from .urort import UrortIE
|
||||
from .urplay import URPlayIE
|
||||
from .usatoday import USATodayIE
|
||||
|
||||
@@ -48,7 +48,7 @@ class FlipagramIE(InfoExtractor):
|
||||
flipagram = video_data['flipagram']
|
||||
video = flipagram['video']
|
||||
|
||||
json_ld = self._search_json_ld(webpage, video_id, fatal=False)
|
||||
json_ld = self._search_json_ld(webpage, video_id, default={})
|
||||
title = json_ld.get('title') or flipagram['captionText']
|
||||
description = json_ld.get('description') or flipagram.get('captionText')
|
||||
|
||||
|
||||
@@ -5,8 +5,8 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class Formula1IE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?formula1\.com/content/fom-website/en/video/\d{4}/\d{1,2}/(?P<id>.+?)\.html'
|
||||
_TEST = {
|
||||
_VALID_URL = r'https?://(?:www\.)?formula1\.com/(?:content/fom-website/)?en/video/\d{4}/\d{1,2}/(?P<id>.+?)\.html'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.formula1.com/content/fom-website/en/video/2016/5/Race_highlights_-_Spain_2016.html',
|
||||
'md5': '8c79e54be72078b26b89e0e111c0502b',
|
||||
'info_dict': {
|
||||
@@ -15,7 +15,10 @@ class Formula1IE(InfoExtractor):
|
||||
'title': 'Race highlights - Spain 2016',
|
||||
},
|
||||
'add_ie': ['Ooyala'],
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.formula1.com/en/video/2016/5/Race_highlights_-_Spain_2016.html',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
|
||||
@@ -131,7 +131,7 @@ class PluzzIE(FranceTVBaseInfoExtractor):
|
||||
|
||||
class FranceTvInfoIE(FranceTVBaseInfoExtractor):
|
||||
IE_NAME = 'francetvinfo.fr'
|
||||
_VALID_URL = r'https?://(?:www|mobile|france3-regions)\.francetvinfo\.fr/.*/(?P<title>.+)\.html'
|
||||
_VALID_URL = r'https?://(?:www|mobile|france3-regions)\.francetvinfo\.fr/(?:[^/]+/)*(?P<title>[^/?#&.]+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.francetvinfo.fr/replay-jt/france-3/soir-3/jt-grand-soir-3-lundi-26-aout-2013_393427.html',
|
||||
@@ -206,6 +206,9 @@ class FranceTvInfoIE(FranceTVBaseInfoExtractor):
|
||||
'uploader_id': 'x2q2ez',
|
||||
},
|
||||
'add_ie': ['Dailymotion'],
|
||||
}, {
|
||||
'url': 'http://france3-regions.francetvinfo.fr/limousin/emissions/jt-1213-limousin',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
||||
@@ -2241,8 +2241,8 @@ class GenericIE(InfoExtractor):
|
||||
|
||||
# Looking for http://schema.org/VideoObject
|
||||
json_ld = self._search_json_ld(
|
||||
webpage, video_id, fatal=False, expected_type='VideoObject')
|
||||
if json_ld and json_ld.get('url'):
|
||||
webpage, video_id, default={}, expected_type='VideoObject')
|
||||
if json_ld.get('url'):
|
||||
info_dict.update({
|
||||
'title': video_title or info_dict['title'],
|
||||
'description': video_description,
|
||||
|
||||
@@ -1,48 +0,0 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class GoldenMoustacheIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?goldenmoustache\.com/(?P<display_id>[\w-]+)-(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.goldenmoustache.com/suricate-le-poker-3700/',
|
||||
'md5': '0f904432fa07da5054d6c8beb5efb51a',
|
||||
'info_dict': {
|
||||
'id': '3700',
|
||||
'ext': 'mp4',
|
||||
'title': 'Suricate - Le Poker',
|
||||
'description': 'md5:3d1f242f44f8c8cb0a106f1fd08e5dc9',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.goldenmoustache.com/le-lab-tout-effacer-mc-fly-et-carlito-55249/',
|
||||
'md5': '27f0c50fb4dd5f01dc9082fc67cd5700',
|
||||
'info_dict': {
|
||||
'id': '55249',
|
||||
'ext': 'mp4',
|
||||
'title': 'Le LAB - Tout Effacer (Mc Fly et Carlito)',
|
||||
'description': 'md5:9b7fbf11023fb2250bd4b185e3de3b2a',
|
||||
'thumbnail': 're:^https?://.*\.(?:png|jpg)$',
|
||||
}
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
video_url = self._html_search_regex(
|
||||
r'data-src-type="mp4" data-src="([^"]+)"', webpage, 'video URL')
|
||||
title = self._html_search_regex(
|
||||
r'<title>(.*?)(?: - Golden Moustache)?</title>', webpage, 'title')
|
||||
thumbnail = self._og_search_thumbnail(webpage)
|
||||
description = self._og_search_description(webpage)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'url': video_url,
|
||||
'ext': 'mp4',
|
||||
'title': title,
|
||||
'description': description,
|
||||
'thumbnail': thumbnail,
|
||||
}
|
||||
@@ -4,6 +4,7 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_urlparse
|
||||
from ..utils import (
|
||||
get_element_by_id,
|
||||
clean_html,
|
||||
@@ -242,8 +243,9 @@ class KuwoSingerIE(InfoExtractor):
|
||||
query={'artistId': artist_id, 'pn': page_num, 'rn': self.PAGE_SIZE})
|
||||
|
||||
return [
|
||||
self.url_result(song_url, 'Kuwo') for song_url in re.findall(
|
||||
r'<div[^>]+class="name"><a[^>]+href="(http://www\.kuwo\.cn/yinyue/\d+)',
|
||||
self.url_result(compat_urlparse.urljoin(url, song_url), 'Kuwo')
|
||||
for song_url in re.findall(
|
||||
r'<div[^>]+class="name"><a[^>]+href="(/yinyue/\d+)',
|
||||
webpage)
|
||||
]
|
||||
|
||||
|
||||
@@ -1,55 +1,71 @@
|
||||
# encoding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import json
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
clean_html,
|
||||
int_or_none,
|
||||
unified_timestamp,
|
||||
update_url_query,
|
||||
)
|
||||
|
||||
|
||||
class RBMARadioIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?rbmaradio\.com/shows/(?P<videoID>[^/]+)$'
|
||||
_VALID_URL = r'https?://(?:www\.)?rbmaradio\.com/shows/(?P<show_id>[^/]+)/episodes/(?P<id>[^/?#&]+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.rbmaradio.com/shows/ford-lopatin-live-at-primavera-sound-2011',
|
||||
'url': 'https://www.rbmaradio.com/shows/main-stage/episodes/ford-lopatin-live-at-primavera-sound-2011',
|
||||
'md5': '6bc6f9bcb18994b4c983bc3bf4384d95',
|
||||
'info_dict': {
|
||||
'id': 'ford-lopatin-live-at-primavera-sound-2011',
|
||||
'ext': 'mp3',
|
||||
'uploader_id': 'ford-lopatin',
|
||||
'location': 'Spain',
|
||||
'description': 'Joel Ford and Daniel ’Oneohtrix Point Never’ Lopatin fly their midified pop extravaganza to Spain. Live at Primavera Sound 2011.',
|
||||
'uploader': 'Ford & Lopatin',
|
||||
'title': 'Live at Primavera Sound 2011',
|
||||
'title': 'Main Stage - Ford & Lopatin',
|
||||
'description': 'md5:4f340fb48426423530af5a9d87bd7b91',
|
||||
'thumbnail': 're:^https?://.*\.jpg',
|
||||
'duration': 2452,
|
||||
'timestamp': 1307103164,
|
||||
'upload_date': '20110603',
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
m = re.match(self._VALID_URL, url)
|
||||
video_id = m.group('videoID')
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
show_id = mobj.group('show_id')
|
||||
episode_id = mobj.group('id')
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
webpage = self._download_webpage(url, episode_id)
|
||||
|
||||
json_data = self._search_regex(r'window\.gon.*?gon\.show=(.+?);$',
|
||||
webpage, 'json data', flags=re.MULTILINE)
|
||||
episode = self._parse_json(
|
||||
self._search_regex(
|
||||
r'__INITIAL_STATE__\s*=\s*({.+?})\s*</script>',
|
||||
webpage, 'json data'),
|
||||
episode_id)['episodes'][show_id][episode_id]
|
||||
|
||||
try:
|
||||
data = json.loads(json_data)
|
||||
except ValueError as e:
|
||||
raise ExtractorError('Invalid JSON: ' + str(e))
|
||||
title = episode['title']
|
||||
|
||||
video_url = data['akamai_url'] + '&cbr=256'
|
||||
show_title = episode.get('showTitle')
|
||||
if show_title:
|
||||
title = '%s - %s' % (show_title, title)
|
||||
|
||||
formats = [{
|
||||
'url': update_url_query(episode['audioURL'], query={'cbr': abr}),
|
||||
'format_id': compat_str(abr),
|
||||
'abr': abr,
|
||||
'vcodec': 'none',
|
||||
} for abr in (96, 128, 256)]
|
||||
|
||||
description = clean_html(episode.get('longTeaser'))
|
||||
thumbnail = self._proto_relative_url(episode.get('imageURL', {}).get('landscape'))
|
||||
duration = int_or_none(episode.get('duration'))
|
||||
timestamp = unified_timestamp(episode.get('publishedAt'))
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'url': video_url,
|
||||
'title': data['title'],
|
||||
'description': data.get('teaser_text'),
|
||||
'location': data.get('country_of_origin'),
|
||||
'uploader': data.get('host', {}).get('name'),
|
||||
'uploader_id': data.get('host', {}).get('slug'),
|
||||
'thumbnail': data.get('image', {}).get('large_url_2x'),
|
||||
'duration': data.get('duration'),
|
||||
'id': episode_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'thumbnail': thumbnail,
|
||||
'duration': duration,
|
||||
'timestamp': timestamp,
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
50
youtube_dl/extractor/rozhlas.py
Normal file
50
youtube_dl/extractor/rozhlas.py
Normal file
@@ -0,0 +1,50 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
remove_start,
|
||||
)
|
||||
|
||||
|
||||
class RozhlasIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?prehravac\.rozhlas\.cz/audio/(?P<id>[0-9]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://prehravac.rozhlas.cz/audio/3421320',
|
||||
'md5': '504c902dbc9e9a1fd50326eccf02a7e2',
|
||||
'info_dict': {
|
||||
'id': '3421320',
|
||||
'ext': 'mp3',
|
||||
'title': 'Echo Pavla Klusáka (30.06.2015 21:00)',
|
||||
'description': 'Osmdesátiny Terryho Rileyho jsou skvělou příležitostí proletět se elektronickými i akustickými díly zakladatatele minimalismu, který je aktivní už přes padesát let'
|
||||
}
|
||||
}, {
|
||||
'url': 'http://prehravac.rozhlas.cz/audio/3421320/embed',
|
||||
'skip_download': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
audio_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(
|
||||
'http://prehravac.rozhlas.cz/audio/%s' % audio_id, audio_id)
|
||||
|
||||
title = self._html_search_regex(
|
||||
r'<h3>(.+?)</h3>\s*<p[^>]*>.*?</p>\s*<div[^>]+id=["\']player-track',
|
||||
webpage, 'title', default=None) or remove_start(
|
||||
self._og_search_title(webpage), 'Radio Wave - ')
|
||||
description = self._html_search_regex(
|
||||
r'<p[^>]+title=(["\'])(?P<url>(?:(?!\1).)+)\1[^>]*>.*?</p>\s*<div[^>]+id=["\']player-track',
|
||||
webpage, 'description', fatal=False, group='url')
|
||||
duration = int_or_none(self._search_regex(
|
||||
r'data-duration=["\'](\d+)', webpage, 'duration', default=None))
|
||||
|
||||
return {
|
||||
'id': audio_id,
|
||||
'url': 'http://media.rozhlas.cz/_audio/%s.mp3' % audio_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'duration': duration,
|
||||
'vcodec': 'none',
|
||||
}
|
||||
@@ -14,7 +14,7 @@ class RtlNlIE(InfoExtractor):
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://(?:www\.)?
|
||||
(?:
|
||||
rtlxl\.nl/\#!/[^/]+/|
|
||||
rtlxl\.nl/[^\#]*\#!/[^/]+/|
|
||||
rtl\.nl/system/videoplayer/(?:[^/]+/)+(?:video_)?embed\.html\b.+?\buuid=
|
||||
)
|
||||
(?P<id>[0-9a-f-]+)'''
|
||||
@@ -67,6 +67,9 @@ class RtlNlIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'http://www.rtl.nl/system/videoplayer/derden/embed.html#!/uuid=bb0353b0-d6a4-1dad-90e9-18fe75b8d1f0',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://rtlxl.nl/?_ga=1.204735956.572365465.1466978370#!/rtl-nieuws-132237/3c487912-023b-49ac-903e-2c5d79f8410f',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
||||
@@ -14,10 +14,10 @@ from ..utils import ExtractorError
|
||||
class SohuIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?P<mytv>my\.)?tv\.sohu\.com/.+?/(?(mytv)|n)(?P<id>\d+)\.shtml.*?'
|
||||
|
||||
# Sohu videos give different MD5 sums on Travis CI and my machine
|
||||
_TESTS = [{
|
||||
'note': 'This video is available only in Mainland China',
|
||||
'url': 'http://tv.sohu.com/20130724/n382479172.shtml#super',
|
||||
'md5': '29175c8cadd8b5cc4055001e85d6b372',
|
||||
'info_dict': {
|
||||
'id': '382479172',
|
||||
'ext': 'mp4',
|
||||
@@ -26,7 +26,6 @@ class SohuIE(InfoExtractor):
|
||||
'skip': 'On available in China',
|
||||
}, {
|
||||
'url': 'http://tv.sohu.com/20150305/n409385080.shtml',
|
||||
'md5': '699060e75cf58858dd47fb9c03c42cfb',
|
||||
'info_dict': {
|
||||
'id': '409385080',
|
||||
'ext': 'mp4',
|
||||
@@ -34,7 +33,6 @@ class SohuIE(InfoExtractor):
|
||||
}
|
||||
}, {
|
||||
'url': 'http://my.tv.sohu.com/us/232799889/78693464.shtml',
|
||||
'md5': '9bf34be48f2f4dadcb226c74127e203c',
|
||||
'info_dict': {
|
||||
'id': '78693464',
|
||||
'ext': 'mp4',
|
||||
@@ -48,7 +46,6 @@ class SohuIE(InfoExtractor):
|
||||
'title': '【神探苍实战秘籍】第13期 战争之影 赫卡里姆',
|
||||
},
|
||||
'playlist': [{
|
||||
'md5': 'bdbfb8f39924725e6589c146bc1883ad',
|
||||
'info_dict': {
|
||||
'id': '78910339_part1',
|
||||
'ext': 'mp4',
|
||||
@@ -56,7 +53,6 @@ class SohuIE(InfoExtractor):
|
||||
'title': '【神探苍实战秘籍】第13期 战争之影 赫卡里姆',
|
||||
}
|
||||
}, {
|
||||
'md5': '3e1f46aaeb95354fd10e7fca9fc1804e',
|
||||
'info_dict': {
|
||||
'id': '78910339_part2',
|
||||
'ext': 'mp4',
|
||||
@@ -64,7 +60,6 @@ class SohuIE(InfoExtractor):
|
||||
'title': '【神探苍实战秘籍】第13期 战争之影 赫卡里姆',
|
||||
}
|
||||
}, {
|
||||
'md5': '8407e634175fdac706766481b9443450',
|
||||
'info_dict': {
|
||||
'id': '78910339_part3',
|
||||
'ext': 'mp4',
|
||||
|
||||
34
youtube_dl/extractor/sonyliv.py
Normal file
34
youtube_dl/extractor/sonyliv.py
Normal file
@@ -0,0 +1,34 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class SonyLIVIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?sonyliv\.com/details/[^/]+/(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
'url': "http://www.sonyliv.com/details/episodes/5024612095001/Ep.-1---Achaari-Cheese-Toast---Bachelor's-Delight",
|
||||
'info_dict': {
|
||||
'title': "Ep. 1 - Achaari Cheese Toast - Bachelor's Delight",
|
||||
'id': '5024612095001',
|
||||
'ext': 'mp4',
|
||||
'upload_date': '20160707',
|
||||
'description': 'md5:7f28509a148d5be9d0782b4d5106410d',
|
||||
'uploader_id': '4338955589001',
|
||||
'timestamp': 1467870968,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
'add_ie': ['BrightcoveNew'],
|
||||
}, {
|
||||
'url': 'http://www.sonyliv.com/details/full%20movie/4951168986001/Sei-Raat-(Bangla)',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/4338955589001/default_default/index.html?videoId=%s'
|
||||
|
||||
def _real_extract(self, url):
|
||||
brightcove_id = self._match_id(url)
|
||||
return self.url_result(
|
||||
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)
|
||||
128
youtube_dl/extractor/uol.py
Normal file
128
youtube_dl/extractor/uol.py
Normal file
@@ -0,0 +1,128 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
clean_html,
|
||||
int_or_none,
|
||||
parse_duration,
|
||||
update_url_query,
|
||||
str_or_none,
|
||||
)
|
||||
|
||||
|
||||
class UOLIE(InfoExtractor):
|
||||
IE_NAME = 'uol.com.br'
|
||||
_VALID_URL = r'https?://(?:.+?\.)?uol\.com\.br/.*?(?:(?:mediaId|v)=|view/(?:[a-z0-9]+/)?|video(?:=|/(?:\d{4}/\d{2}/\d{2}/)?))(?P<id>\d+|[\w-]+-[A-Z0-9]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://player.mais.uol.com.br/player_video_v3.swf?mediaId=15951931',
|
||||
'md5': '25291da27dc45e0afb5718a8603d3816',
|
||||
'info_dict': {
|
||||
'id': '15951931',
|
||||
'ext': 'mp4',
|
||||
'title': 'Miss simpatia é encontrada morta',
|
||||
'description': 'md5:3f8c11a0c0556d66daf7e5b45ef823b2',
|
||||
}
|
||||
}, {
|
||||
'url': 'http://tvuol.uol.com.br/video/incendio-destroi-uma-das-maiores-casas-noturnas-de-londres-04024E9A3268D4C95326',
|
||||
'md5': 'e41a2fb7b7398a3a46b6af37b15c00c9',
|
||||
'info_dict': {
|
||||
'id': '15954259',
|
||||
'ext': 'mp4',
|
||||
'title': 'Incêndio destrói uma das maiores casas noturnas de Londres',
|
||||
'description': 'Em Londres, um incêndio destruiu uma das maiores boates da cidade. Não há informações sobre vítimas.',
|
||||
}
|
||||
}, {
|
||||
'url': 'http://mais.uol.com.br/static/uolplayer/index.html?mediaId=15951931',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://mais.uol.com.br/view/15954259',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://noticias.band.uol.com.br/brasilurgente/video/2016/08/05/15951931/miss-simpatia-e-encontrada-morta.html',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://videos.band.uol.com.br/programa.asp?e=noticias&pr=brasil-urgente&v=15951931&t=Policia-desmonte-base-do-PCC-na-Cracolandia',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://mais.uol.com.br/view/cphaa0gl2x8r/incendio-destroi-uma-das-maiores-casas-noturnas-de-londres-04024E9A3268D4C95326',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://noticias.uol.com.br//videos/assistir.htm?video=rafaela-silva-inspira-criancas-no-judo-04024D983968D4C95326',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://mais.uol.com.br/view/e0qbgxid79uv/15275470',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
_FORMATS = {
|
||||
'2': {
|
||||
'width': 640,
|
||||
'height': 360,
|
||||
},
|
||||
'5': {
|
||||
'width': 1080,
|
||||
'height': 720,
|
||||
},
|
||||
'6': {
|
||||
'width': 426,
|
||||
'height': 240,
|
||||
},
|
||||
'7': {
|
||||
'width': 1920,
|
||||
'height': 1080,
|
||||
},
|
||||
'8': {
|
||||
'width': 192,
|
||||
'height': 144,
|
||||
},
|
||||
'9': {
|
||||
'width': 568,
|
||||
'height': 320,
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
if not video_id.isdigit():
|
||||
embed_page = self._download_webpage('https://jsuol.com.br/c/tv/uol/embed/?params=[embed,%s]' % video_id, video_id)
|
||||
video_id = self._search_regex(r'mediaId=(\d+)', embed_page, 'media id')
|
||||
video_data = self._download_json(
|
||||
'http://mais.uol.com.br/apiuol/v3/player/getMedia/%s.json' % video_id,
|
||||
video_id)['item']
|
||||
title = video_data['title']
|
||||
|
||||
query = {
|
||||
'ver': video_data.get('numRevision', 2),
|
||||
'r': 'http://mais.uol.com.br',
|
||||
}
|
||||
formats = []
|
||||
for f in video_data.get('formats', []):
|
||||
f_url = f.get('url') or f.get('secureUrl')
|
||||
if not f_url:
|
||||
continue
|
||||
format_id = str_or_none(f.get('id'))
|
||||
fmt = {
|
||||
'format_id': format_id,
|
||||
'url': update_url_query(f_url, query),
|
||||
}
|
||||
fmt.update(self._FORMATS.get(format_id, {}))
|
||||
formats.append(fmt)
|
||||
self._sort_formats(formats)
|
||||
|
||||
tags = []
|
||||
for tag in video_data.get('tags', []):
|
||||
tag_description = tag.get('description')
|
||||
if not tag_description:
|
||||
continue
|
||||
tags.append(tag_description)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': clean_html(video_data.get('desMedia')),
|
||||
'thumbnail': video_data.get('thumbnail'),
|
||||
'duration': int_or_none(video_data.get('durationSeconds')) or parse_duration(video_data.get('duration')),
|
||||
'tags': tags,
|
||||
'formats': formats,
|
||||
}
|
||||
@@ -9,6 +9,7 @@ from ..utils import (
|
||||
ExtractorError,
|
||||
unified_strdate,
|
||||
HEADRequest,
|
||||
int_or_none,
|
||||
)
|
||||
|
||||
|
||||
@@ -30,48 +31,58 @@ class WatIE(InfoExtractor):
|
||||
},
|
||||
{
|
||||
'url': 'http://www.wat.tv/video/gregory-lemarchal-voix-ange-6z1v7_6ygkj_.html',
|
||||
'md5': 'fbc84e4378165278e743956d9c1bf16b',
|
||||
'md5': '34bdfa5ca9fd3c7eb88601b635b0424c',
|
||||
'info_dict': {
|
||||
'id': '11713075',
|
||||
'ext': 'mp4',
|
||||
'title': 'Grégory Lemarchal, une voix d\'ange depuis 10 ans (1/3)',
|
||||
'description': 'md5:b7a849cf16a2b733d9cd10c52906dee3',
|
||||
'upload_date': '20140816',
|
||||
'duration': 2910,
|
||||
},
|
||||
'skip': "Ce contenu n'est pas disponible pour l'instant.",
|
||||
'expected_warnings': ["Ce contenu n'est pas disponible pour l'instant."],
|
||||
},
|
||||
]
|
||||
|
||||
_FORMATS = (
|
||||
(200, 416, 234),
|
||||
(400, 480, 270),
|
||||
(600, 640, 360),
|
||||
(1200, 640, 360),
|
||||
(1800, 960, 540),
|
||||
(2500, 1280, 720),
|
||||
)
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
video_id = video_id if video_id.isdigit() and len(video_id) > 6 else compat_str(int(video_id, 36))
|
||||
|
||||
# 'contentv4' is used in the website, but it also returns the related
|
||||
# videos, we don't need them
|
||||
video_info = self._download_json(
|
||||
'http://www.wat.tv/interface/contentv3/' + video_id, video_id)['media']
|
||||
video_data = self._download_json(
|
||||
'http://www.wat.tv/interface/contentv4s/' + video_id, video_id)
|
||||
video_info = video_data['media']
|
||||
|
||||
error_desc = video_info.get('error_desc')
|
||||
if error_desc:
|
||||
raise ExtractorError(
|
||||
'%s returned error: %s' % (self.IE_NAME, error_desc), expected=True)
|
||||
self.report_warning(
|
||||
'%s returned error: %s' % (self.IE_NAME, error_desc))
|
||||
|
||||
chapters = video_info['chapters']
|
||||
first_chapter = chapters[0]
|
||||
if chapters:
|
||||
first_chapter = chapters[0]
|
||||
|
||||
def video_id_for_chapter(chapter):
|
||||
return chapter['tc_start'].split('-')[0]
|
||||
def video_id_for_chapter(chapter):
|
||||
return chapter['tc_start'].split('-')[0]
|
||||
|
||||
if video_id_for_chapter(first_chapter) != video_id:
|
||||
self.to_screen('Multipart video detected')
|
||||
entries = [self.url_result('wat:%s' % video_id_for_chapter(chapter)) for chapter in chapters]
|
||||
return self.playlist_result(entries, video_id, video_info['title'])
|
||||
# Otherwise we can continue and extract just one part, we have to use
|
||||
# the video id for getting the video url
|
||||
if video_id_for_chapter(first_chapter) != video_id:
|
||||
self.to_screen('Multipart video detected')
|
||||
entries = [self.url_result('wat:%s' % video_id_for_chapter(chapter)) for chapter in chapters]
|
||||
return self.playlist_result(entries, video_id, video_info['title'])
|
||||
# Otherwise we can continue and extract just one part, we have to use
|
||||
# the video id for getting the video url
|
||||
else:
|
||||
first_chapter = video_info
|
||||
|
||||
date_diffusion = first_chapter.get('date_diffusion')
|
||||
upload_date = unified_strdate(date_diffusion) if date_diffusion else None
|
||||
title = first_chapter['title']
|
||||
|
||||
def extract_url(path_template, url_type):
|
||||
req_url = 'http://www.wat.tv/get/%s' % (path_template % video_id)
|
||||
@@ -83,36 +94,61 @@ class WatIE(InfoExtractor):
|
||||
expected=True)
|
||||
return red_url
|
||||
|
||||
m3u8_url = extract_url('ipad/%s.m3u8', 'm3u8')
|
||||
http_url = extract_url('android5/%s.mp4', 'http')
|
||||
|
||||
formats = []
|
||||
m3u8_formats = self._extract_m3u8_formats(
|
||||
m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls')
|
||||
formats.extend(m3u8_formats)
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
m3u8_url.replace('ios.', 'web.').replace('.m3u8', '.f4m'),
|
||||
video_id, f4m_id='hds', fatal=False))
|
||||
for m3u8_format in m3u8_formats:
|
||||
vbr, abr = m3u8_format.get('vbr'), m3u8_format.get('abr')
|
||||
if not vbr or not abr:
|
||||
continue
|
||||
f = m3u8_format.copy()
|
||||
f.update({
|
||||
'url': re.sub(r'%s-\d+00-\d+' % video_id, '%s-%d00-%d' % (video_id, round(vbr / 100), round(abr)), http_url),
|
||||
'format_id': f['format_id'].replace('hls', 'http'),
|
||||
'protocol': 'http',
|
||||
})
|
||||
formats.append(f)
|
||||
self._sort_formats(formats)
|
||||
try:
|
||||
http_url = extract_url('android5/%s.mp4', 'http')
|
||||
m3u8_url = extract_url('ipad/%s.m3u8', 'm3u8')
|
||||
m3u8_formats = self._extract_m3u8_formats(
|
||||
m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls')
|
||||
formats.extend(m3u8_formats)
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
m3u8_url.replace('ios.', 'web.').replace('.m3u8', '.f4m'),
|
||||
video_id, f4m_id='hds', fatal=False))
|
||||
for m3u8_format in m3u8_formats:
|
||||
vbr, abr = m3u8_format.get('vbr'), m3u8_format.get('abr')
|
||||
if not vbr or not abr:
|
||||
continue
|
||||
format_id = m3u8_format['format_id'].replace('hls', 'http')
|
||||
fmt_url = re.sub(r'%s-\d+00-\d+' % video_id, '%s-%d00-%d' % (video_id, round(vbr / 100), round(abr)), http_url)
|
||||
if self._is_valid_url(fmt_url, video_id, format_id):
|
||||
f = m3u8_format.copy()
|
||||
f.update({
|
||||
'url': fmt_url,
|
||||
'format_id': format_id,
|
||||
'protocol': 'http',
|
||||
})
|
||||
formats.append(f)
|
||||
self._sort_formats(formats)
|
||||
except ExtractorError:
|
||||
abr = 64
|
||||
for vbr, width, height in self._FORMATS:
|
||||
tbr = vbr + abr
|
||||
format_id = 'http-%s' % tbr
|
||||
fmt_url = 'http://dnl.adv.tf1.fr/2/USP-0x0/%s/%s/%s/ssm/%s-%s-64k.mp4' % (video_id[-4:-2], video_id[-2:], video_id, video_id, vbr)
|
||||
if self._is_valid_url(fmt_url, video_id, format_id):
|
||||
formats.append({
|
||||
'format_id': format_id,
|
||||
'url': fmt_url,
|
||||
'vbr': vbr,
|
||||
'abr': abr,
|
||||
'width': width,
|
||||
'height': height,
|
||||
})
|
||||
|
||||
date_diffusion = first_chapter.get('date_diffusion') or video_data.get('configv4', {}).get('estatS4')
|
||||
upload_date = unified_strdate(date_diffusion) if date_diffusion else None
|
||||
duration = None
|
||||
files = video_info['files']
|
||||
if files:
|
||||
duration = int_or_none(files[0].get('duration'))
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': first_chapter['title'],
|
||||
'thumbnail': first_chapter['preview'],
|
||||
'description': first_chapter['description'],
|
||||
'view_count': video_info['views'],
|
||||
'title': title,
|
||||
'thumbnail': first_chapter.get('preview'),
|
||||
'description': first_chapter.get('description'),
|
||||
'view_count': int_or_none(video_info.get('views')),
|
||||
'upload_date': upload_date,
|
||||
'duration': video_info['files'][0]['duration'],
|
||||
'duration': duration,
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
@@ -499,9 +499,20 @@ def parseOpts(overrideArguments=None):
|
||||
dest='bidi_workaround', action='store_true',
|
||||
help='Work around terminals that lack bidirectional text support. Requires bidiv or fribidi executable in PATH')
|
||||
workarounds.add_option(
|
||||
'--sleep-interval', metavar='SECONDS',
|
||||
'--sleep-interval', '--min-sleep-interval', metavar='SECONDS',
|
||||
dest='sleep_interval', type=float,
|
||||
help='Number of seconds to sleep before each download.')
|
||||
help=(
|
||||
'Number of seconds to sleep before each download when used alone '
|
||||
'or a lower bound of a range for randomized sleep before each download '
|
||||
'(minimum possible number of seconds to sleep) when used along with '
|
||||
'--max-sleep-interval.'))
|
||||
workarounds.add_option(
|
||||
'--max-sleep-interval', metavar='SECONDS',
|
||||
dest='max_sleep_interval', type=float,
|
||||
help=(
|
||||
'Upper bound of a range for randomized sleep before each download '
|
||||
'(maximum possible number of seconds to sleep). Must only be used '
|
||||
'along with --min-sleep-interval.'))
|
||||
|
||||
verbosity = optparse.OptionGroup(parser, 'Verbosity / Simulation Options')
|
||||
verbosity.add_option(
|
||||
|
||||
@@ -3,11 +3,6 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import PostProcessor
|
||||
from ..utils import PostProcessingError
|
||||
|
||||
|
||||
class MetadataFromTitlePPError(PostProcessingError):
|
||||
pass
|
||||
|
||||
|
||||
class MetadataFromTitlePP(PostProcessor):
|
||||
@@ -38,7 +33,8 @@ class MetadataFromTitlePP(PostProcessor):
|
||||
title = info['title']
|
||||
match = re.match(self._titleregex, title)
|
||||
if match is None:
|
||||
raise MetadataFromTitlePPError('Could not interpret title of video as "%s"' % self._titleformat)
|
||||
self._downloader.to_screen('[fromtitle] Could not interpret title of video as "%s"' % self._titleformat)
|
||||
return [], info
|
||||
for attribute, value in match.groupdict().items():
|
||||
value = match.group(attribute)
|
||||
info[attribute] = value
|
||||
|
||||
@@ -122,6 +122,7 @@ DATE_FORMATS = (
|
||||
'%Y %m %d',
|
||||
'%Y-%m-%d',
|
||||
'%Y/%m/%d',
|
||||
'%Y/%m/%d %H:%M',
|
||||
'%Y/%m/%d %H:%M:%S',
|
||||
'%Y-%m-%d %H:%M:%S',
|
||||
'%Y-%m-%d %H:%M:%S.%f',
|
||||
|
||||
@@ -1,3 +1,3 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
__version__ = '2016.08.07'
|
||||
__version__ = '2016.08.12'
|
||||
|
||||
Reference in New Issue
Block a user