release 2016.04.01

[cbs] improve extraction (closes #6321)
[generic] remove sbnation test (handled by VoxMediaIE)
68 changed files with 923 additions and 332 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -0,0 +1,58 @@
+## Please follow the guide below
+
+- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
+- Put an `x` into all the boxes [ ] relevant to your *issue* (like that [x])
+- Use *Preview* tab to see how your issue will actually look like
+
+---
+
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.04.01*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.04.01**
+
+### Before submitting an *issue* make sure you have:
+- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
+- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
+
+### What is the purpose of your *issue*?
+- [ ] Bug report (encountered problems with youtube-dl)
+- [ ] Site support request (request for adding support for a new site)
+- [ ] Feature request (request for a new functionality)
+- [ ] Question
+- [ ] Other
+
+---
+
+### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
+
+---
+
+### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
+
+Add `-v` flag to **your command line** you run youtube-dl with, copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
+```
+$ youtube-dl -v <your command line>
+[debug] System config: []
+[debug] User config: []
+[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
+[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
+[debug] youtube-dl version 2016.04.01
+[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
+[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
+[debug] Proxy map: {}
+...
+<end of log>
+```
+
+---
+
+### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
+- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
+- Single video: https://youtu.be/BaW_jenozKc
+- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
+
+---
+
+### Description of your *issue*, suggested solution and other information
+
+Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
+If work on your *issue* required an account credentials please provide them or explain how one can obtain them.
--- a/.github/ISSUE_TEMPLATE_tmpl.md
+++ b/.github/ISSUE_TEMPLATE_tmpl.md
@@ -0,0 +1,58 @@
+## Please follow the guide below
+
+- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
+- Put an `x` into all the boxes [ ] relevant to your *issue* (like that [x])
+- Use *Preview* tab to see how your issue will actually look like
+
+---
+
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *%(version)s*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s**
+
+### Before submitting an *issue* make sure you have:
+- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
+- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
+
+### What is the purpose of your *issue*?
+- [ ] Bug report (encountered problems with youtube-dl)
+- [ ] Site support request (request for adding support for a new site)
+- [ ] Feature request (request for a new functionality)
+- [ ] Question
+- [ ] Other
+
+---
+
+### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
+
+---
+
+### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
+
+Add `-v` flag to **your command line** you run youtube-dl with, copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
+```
+$ youtube-dl -v <your command line>
+[debug] System config: []
+[debug] User config: []
+[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
+[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
+[debug] youtube-dl version %(version)s
+[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
+[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
+[debug] Proxy map: {}
+...
+<end of log>
+```
+
+---
+
+### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
+- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
+- Single video: https://youtu.be/BaW_jenozKc
+- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
+
+---
+
+### Description of your *issue*, suggested solution and other information
+
+Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
+If work on your *issue* required an account credentials please provide them or explain how one can obtain them.
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
-all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
+all: youtube-dl README.md CONTRIBUTING.md ISSUE_TEMPLATE.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites

 clean:
-	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
+	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
 	find . -name "*.pyc" -delete
 	find . -name "*.class" -delete

@@ -59,6 +59,9 @@ README.md: youtube_dl/*.py youtube_dl/*/*.py
 CONTRIBUTING.md: README.md
 	$(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md

+ISSUE_TEMPLATE.md:
+	$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md .github/ISSUE_TEMPLATE.md
+
 supportedsites:
 	$(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md

--- a/README.md
+++ b/README.md
@@ -600,6 +600,7 @@ Also filtering work for comparisons `=` (equals), `!=` (not equals), `^=` (begin
 - `vcodec`: Name of the video codec in use
 - `container`: Name of the container format
 - `protocol`: The protocol that will be used for the actual download, lower-case. `http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `m3u8`, or `m3u8_native`
+ - `format_id`: A short description of the format

 Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by video hoster.

--- a/devscripts/make_issue_template.py
+++ b/devscripts/make_issue_template.py
@@ -0,0 +1,29 @@
+#!/usr/bin/env python
+from __future__ import unicode_literals
+
+import io
+import optparse
+
+
+def main():
+    parser = optparse.OptionParser(usage='%prog INFILE OUTFILE')
+    options, args = parser.parse_args()
+    if len(args) != 2:
+        parser.error('Expected an input and an output filename')
+
+    infile, outfile = args
+
+    with io.open(infile, encoding='utf-8') as inf:
+        issue_template_tmpl = inf.read()
+
+    # Get the version from youtube_dl/version.py without importing the package
+    exec(compile(open('youtube_dl/version.py').read(),
+                 'youtube_dl/version.py', 'exec'))
+
+    out = issue_template_tmpl % {'version': locals()['__version__']}
+
+    with io.open(outfile, 'w', encoding='utf-8') as outf:
+        outf.write(out)
+
+if __name__ == '__main__':
+    main()
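The substitution step in `make_issue_template.py` can be sketched in isolation as follows. This is a minimal sketch, not the script itself: the two-line `template` string and the inline `version_source` are hypothetical stand-ins for `.github/ISSUE_TEMPLATE_tmpl.md` and `youtube_dl/version.py`, and `read_version` and `render_issue_template` are illustrative names that do not appear in the commit.

```python
# Hypothetical stand-ins for the two input files used by the real script.
version_source = "__version__ = '2016.04.01'\n"
template = (
    "ensure your version is *%(version)s*\n"
    "[debug] youtube-dl version %(version)s\n"
)


def read_version(source):
    """Execute version.py-style source in its own namespace and return __version__.

    An explicit namespace dict is an assumption made here for clarity; the
    script in the diff reads the value back through locals() instead.
    """
    namespace = {}
    exec(compile(source, 'version.py', 'exec'), namespace)
    return namespace['__version__']


def render_issue_template(template_text, version):
    """Fill every %(version)s placeholder using printf-style formatting."""
    return template_text % {'version': version}


rendered = render_issue_template(template, read_version(version_source))
print(rendered)
```

Because the template is rendered with the `%` operator, any literal percent sign added to the template file later would have to be written as `%%`.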
--- a/devscripts/release.sh
+++ b/devscripts/release.sh
@@ -45,9 +45,9 @@ fi
 /bin/echo -e "\n### Changing version in version.py..."
 sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py

-/bin/echo -e "\n### Committing documentation and youtube_dl/version.py..."
-make README.md CONTRIBUTING.md supportedsites
-git add README.md CONTRIBUTING.md docs/supportedsites.md youtube_dl/version.py
+/bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..."
+make README.md CONTRIBUTING.md ISSUE_TEMPLATE.md supportedsites
+git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py
 git commit -m "release $version"

 /bin/echo -e "\n### Now tagging, signing and pushing..."
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -118,6 +118,7 @@
 - **Clubic**
 - **Clyp**
 - **cmt.com**
+ - **CNBC**
 - **CNET**
 - **CNN**
 - **CNNArticle**
@@ -134,6 +135,7 @@
 - **CrooksAndLiars**
 - **Crunchyroll**
 - **crunchyroll:playlist**
+ - **CSNNE**
 - **CSpan**: C-SPAN
 - **CtsNews**: 華視新聞
 - **culturebox.francetvinfo.fr**
@@ -376,7 +378,8 @@
 - **myvideo** (Currently broken)
 - **MyVidster**
 - **n-tv.de**
- - **NationalGeographic**
+ - **natgeo**
+ - **natgeo:channel**
 - **Naver**
 - **NBA**
 - **NBC**
@@ -618,7 +621,6 @@
 - **Telegraaf**
 - **TeleMB**
 - **TeleTask**
- - **TenPlay**
 - **TF1**
 - **TheIntercept**
 - **TheOnion**
@@ -740,6 +742,7 @@
 - **vlive**
 - **Vodlocker**
 - **VoiceRepublic**
+ - **VoxMedia**
 - **Vporn**
 - **vpro**: npo.nl and ntr.nl
 - **VRT**
--- a/setup.cfg
+++ b/setup.cfg
@@ -2,5 +2,5 @@
 universal = True

 [flake8]
-exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,setup.py,build,.git
+exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/make_issue_template.py,setup.py,build,.git
 ignore = E402,E501,E731
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dl/YoutubeDL.py
@@ -39,6 +39,8 @@ from .compat import (
    compat_urllib_request_DataHandler,
 )
 from .utils import (
+    age_restricted,
+    args_to_str,
    ContentTooShortError,
    date_from_str,
    DateRange,
@@ -58,13 +60,16 @@ from .utils import (
    PagedList,
    parse_filesize,
    PerRequestProxyHandler,
-    PostProcessingError,
    platform_name,
+    PostProcessingError,
    preferredencoding,
+    prepend_extension,
    render_table,
+    replace_extension,
    SameFileError,
    sanitize_filename,
    sanitize_path,
+    sanitize_url,
    sanitized_Request,
    std_headers,
    subtitles_filename,
@@ -75,10 +80,6 @@ from .utils import (
    write_string,
    YoutubeDLCookieProcessor,
    YoutubeDLHandler,
-    prepend_extension,
-    replace_extension,
-    args_to_str,
-    age_restricted,
 )
 from .cache import Cache
 from .extractor import get_info_extractor, gen_extractors
@@ -1229,6 +1230,7 @@ class YoutubeDL(object):
                t.get('preference'), t.get('width'), t.get('height'),
                t.get('id'), t.get('url')))
            for i, t in enumerate(thumbnails):
+                t['url'] = sanitize_url(t['url'])
                if t.get('width') and t.get('height'):
                    t['resolution'] = '%dx%d' % (t['width'], t['height'])
                if t.get('id') is None:
@@ -1263,6 +1265,8 @@ class YoutubeDL(object):
        if subtitles:
            for _, subtitle in subtitles.items():
                for subtitle_format in subtitle:
+                    if subtitle_format.get('url'):
+                        subtitle_format['url'] = sanitize_url(subtitle_format['url'])
                    if 'ext' not in subtitle_format:
                        subtitle_format['ext'] = determine_ext(subtitle_format['url']).lower()

@@ -1292,6 +1296,8 @@ class YoutubeDL(object):
            if 'url' not in format:
                raise ExtractorError('Missing "url" key in result (index %d)' % i)

+            format['url'] = sanitize_url(format['url'])
+
            if format.get('format_id') is None:
                format['format_id'] = compat_str(i)
            else:
--- a/youtube_dl/downloader/f4m.py
+++ b/youtube_dl/downloader/f4m.py
@@ -223,6 +223,12 @@ def write_metadata_tag(stream, metadata):
        write_unsigned_int(stream, FLV_TAG_HEADER_LEN + len(metadata))


+def remove_encrypted_media(media):
+    return list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib and
+                                 'drmAdditionalHeaderSetId' not in e.attrib,
+                       media))
+
+
 def _add_ns(prop):
    return '{http://ns.adobe.com/f4m/1.0}%s' % prop

@@ -244,9 +250,7 @@ class F4mFD(FragmentFD):
            # without drmAdditionalHeaderId or drmAdditionalHeaderSetId attribute
            if 'id' not in e.attrib:
                self.report_error('Missing ID in f4m DRM')
-        media = list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib and
-                                      'drmAdditionalHeaderSetId' not in e.attrib,
-                            media))
+        media = remove_encrypted_media(media)
        if not media:
            self.report_error('Unsupported DRM')
        return media
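The effect of the extracted `remove_encrypted_media` helper can be illustrated on a small, made-up manifest. This is a sketch under assumptions: the XML below is invented for the example and omits the f4m namespace that real manifests carry, and the filter is written as a list comprehension rather than the `filter`/`lambda` form used in the diff.

```python
import xml.etree.ElementTree as ET


def remove_encrypted_media(media):
    """Keep only media elements that carry neither DRM header attribute."""
    return [e for e in media
            if 'drmAdditionalHeaderId' not in e.attrib
            and 'drmAdditionalHeaderSetId' not in e.attrib]


# Invented manifest: one clear stream and two DRM-protected ones.
manifest = ET.fromstring(
    '<manifest>'
    '<media url="clear" bitrate="500"/>'
    '<media url="protected" bitrate="800" drmAdditionalHeaderId="drm1"/>'
    '<media url="protected-set" bitrate="1200" drmAdditionalHeaderSetId="set1"/>'
    '</manifest>')

playable = remove_encrypted_media(manifest.findall('media'))
print([e.attrib['url'] for e in playable])
```

Moving the filter to module level lets `extractor/common.py` import it, as the `from ..downloader.f4m import remove_encrypted_media` line later in this diff shows.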
--- a/youtube_dl/extractor/__init__.py
+++ b/youtube_dl/extractor/__init__.py
@@ -127,6 +127,7 @@ from .cloudy import CloudyIE
 from .clubic import ClubicIE
 from .clyp import ClypIE
 from .cmt import CMTIE
+from .cnbc import CNBCIE
 from .cnet import CNETIE
 from .cnn import (
    CNNIE,
@@ -437,10 +438,14 @@ from .myspass import MySpassIE
 from .myvi import MyviIE
 from .myvideo import MyVideoIE
 from .myvidster import MyVidsterIE
-from .nationalgeographic import NationalGeographicIE
+from .nationalgeographic import (
+    NationalGeographicIE,
+    NationalGeographicChannelIE,
+)
 from .naver import NaverIE
 from .nba import NBAIE
 from .nbc import (
+    CSNNEIE,
    NBCIE,
    NBCNewsIE,
    NBCSportsIE,
@@ -735,7 +740,6 @@ from .telecinco import TelecincoIE
 from .telegraaf import TelegraafIE
 from .telemb import TeleMBIE
 from .teletask import TeleTaskIE
-from .tenplay import TenPlayIE
 from .testurl import TestURLIE
 from .tf1 import TF1IE
 from .theintercept import TheInterceptIE
@@ -900,6 +904,7 @@ from .vk import (
 from .vlive import VLiveIE
 from .vodlocker import VodlockerIE
 from .voicerepublic import VoiceRepublicIE
+from .voxmedia import VoxMediaIE
 from .vporn import VpornIE
 from .vrt import VRTIE
 from .vube import VubeIE
--- a/youtube_dl/extractor/abc7news.py
+++ b/youtube_dl/extractor/abc7news.py
@@ -44,6 +44,7 @@ class Abc7NewsIE(InfoExtractor):
            'contentURL', webpage, 'm3u8 url', fatal=True)

        formats = self._extract_m3u8_formats(m3u8, display_id, 'mp4')
+        self._sort_formats(formats)

        title = self._og_search_title(webpage).strip()
        description = self._og_search_description(webpage).strip()
--- a/youtube_dl/extractor/amp.py
+++ b/youtube_dl/extractor/amp.py
@@ -69,12 +69,14 @@ class AMPIE(InfoExtractor):

        self._sort_formats(formats)

+        timestamp = parse_iso8601(item.get('pubDate'), ' ') or parse_iso8601(item.get('dc-date'))
+
        return {
            'id': video_id,
            'title': get_media_node('title'),
            'description': get_media_node('description'),
            'thumbnails': thumbnails,
-            'timestamp': parse_iso8601(item.get('pubDate'), ' '),
+            'timestamp': timestamp,
            'duration': int_or_none(media_content[0].get('@attributes', {}).get('duration')),
            'subtitles': subtitles,
            'formats': formats,
--- a/youtube_dl/extractor/azubu.py
+++ b/youtube_dl/extractor/azubu.py
@@ -120,6 +120,7 @@ class AzubuLiveIE(InfoExtractor):
        bc_info = self._download_json(req, user)
        m3u8_url = next(source['src'] for source in bc_info['sources'] if source['container'] == 'M2TS')
        formats = self._extract_m3u8_formats(m3u8_url, user, ext='mp4')
+        self._sort_formats(formats)

        return {
            'id': info['id'],
--- a/youtube_dl/extractor/bbc.py
+++ b/youtube_dl/extractor/bbc.py
@@ -688,6 +688,10 @@ class BBCIE(BBCCoUkIE):
        # custom redirection to www.bbc.com
        'url': 'http://www.bbc.co.uk/news/science-environment-33661876',
        'only_matching': True,
+    }, {
+        # single video article embedded with data-media-vpid
+        'url': 'http://www.bbc.co.uk/sport/rowing/35908187',
+        'only_matching': True,
    }]

    @classmethod
@@ -817,7 +821,7 @@ class BBCIE(BBCCoUkIE):

        # single video story (e.g. http://www.bbc.com/travel/story/20150625-sri-lankas-spicy-secret)
        programme_id = self._search_regex(
-            [r'data-video-player-vpid="(%s)"' % self._ID_REGEX,
+            [r'data-(?:video-player|media)-vpid="(%s)"' % self._ID_REGEX,
             r'<param[^>]+name="externalIdentifier"[^>]+value="(%s)"' % self._ID_REGEX,
             r'videoId\s*:\s*["\'](%s)["\']' % self._ID_REGEX],
            webpage, 'vpid', default=None)
--- a/youtube_dl/extractor/beeg.py
+++ b/youtube_dl/extractor/beeg.py
@@ -34,7 +34,7 @@ class BeegIE(InfoExtractor):
        video_id = self._match_id(url)

        video = self._download_json(
-            'https://api.beeg.com/api/v5/video/%s' % video_id, video_id)
+            'https://api.beeg.com/api/v6/1738/video/%s' % video_id, video_id)

        def split(o, e):
            def cut(s, x):
@@ -50,8 +50,8 @@ class BeegIE(InfoExtractor):
            return n

        def decrypt_key(key):
-            # Reverse engineered from http://static.beeg.com/cpl/1105.js
-            a = '5ShMcIQlssOd7zChAIOlmeTZDaUxULbJRnywYaiB'
+            # Reverse engineered from http://static.beeg.com/cpl/1738.js
+            a = 'GUuyodcfS8FW8gQp4OKLMsZBcX0T7B'
            e = compat_urllib_parse_unquote(key)
            o = ''.join([
                compat_chr(compat_ord(e[n]) - compat_ord(a[n % len(a)]) % 21)
--- a/youtube_dl/extractor/bet.py
+++ b/youtube_dl/extractor/bet.py
@@ -94,6 +94,7 @@ class BetIE(InfoExtractor):
            xpath_with_ns('./media:thumbnail', NS_MAP)).get('url')

        formats = self._extract_smil_formats(smil_url, display_id)
+        self._sort_formats(formats)

        return {
            'id': video_id,
--- a/youtube_dl/extractor/brightcove.py
+++ b/youtube_dl/extractor/brightcove.py
@@ -136,13 +136,16 @@ class BrightcoveLegacyIE(InfoExtractor):
        else:
            flashvars = {}

+        data_url = object_doc.attrib.get('data', '')
+        data_url_params = compat_parse_qs(compat_urllib_parse_urlparse(data_url).query)
+
        def find_param(name):
            if name in flashvars:
                return flashvars[name]
            node = find_xpath_attr(object_doc, './param', 'name', name)
            if node is not None:
                return node.attrib['value']
-            return None
+            return data_url_params.get(name)

        params = {}

@@ -294,7 +297,7 @@ class BrightcoveLegacyIE(InfoExtractor):
            'uploader': video_info.get('publisherName'),
        }

-        renditions = video_info.get('renditions')
+        renditions = video_info.get('renditions', []) + video_info.get('IOSRenditions', [])
        if renditions:
            formats = []
            for rend in renditions:
@@ -316,13 +319,23 @@ class BrightcoveLegacyIE(InfoExtractor):
                if ext is None:
                    ext = determine_ext(url)
                size = rend.get('size')
-                formats.append({
+                a_format = {
                    'url': url,
                    'ext': ext,
                    'height': rend.get('frameHeight'),
                    'width': rend.get('frameWidth'),
                    'filesize': size if size != 0 else None,
-                })
+                }
+
+                # m3u8 manifests with remote == false are media playlists
+                # Not calling _extract_m3u8_formats here to save network traffic
+                if ext == 'm3u8':
+                    a_format.update({
+                        'ext': 'mp4',
+                        'protocol': 'm3u8',
+                    })
+
+                formats.append(a_format)
            self._sort_formats(formats)
            info['formats'] = formats
        elif video_info.get('FLVFullLengthURL') is not None:
@@ -426,7 +439,7 @@ class BrightcoveNewIE(InfoExtractor):
                    </video>.*?
                    <script[^>]+
                        src=["\'](?:https?:)?//players\.brightcove\.net/
-                        (\d+)/([\da-f-]+)_([^/]+)/index(?:\.min)?\.js
+                        (\d+)/([^/]+)_([^/]+)/index(?:\.min)?\.js
                ''', webpage):
            entries.append(
                'http://players.brightcove.net/%s/%s_%s/index.html?videoId=%s'
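The lookup order introduced for `find_param` in the first `brightcove.py` hunk above (flashvars, then `<param>` nodes, then the query string of the object's `data` URL) can be sketched with the standard library alone. Several things here are assumptions: `make_find_param` is a hypothetical wrapper, the `<object>` markup is invented, `urllib.parse` stands in for the `compat_*` helpers, and unwrapping the first query value is a choice made for this sketch, whereas the diff returns the raw lookup result.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlparse


def make_find_param(object_doc, flashvars):
    """Build a lookup that falls back to the data URL's query parameters."""
    data_url = object_doc.attrib.get('data', '')
    data_url_params = parse_qs(urlparse(data_url).query)

    def find_param(name):
        # 1. Explicit flashvars take priority.
        if name in flashvars:
            return flashvars[name]
        # 2. Then <param name="..." value="..."> children of the object.
        for node in object_doc.findall('./param'):
            if node.attrib.get('name') == name:
                return node.attrib.get('value')
        # 3. Finally the query string of the data URL (parse_qs yields lists).
        values = data_url_params.get(name)
        return values[0] if values else None

    return find_param


# Invented embed markup for illustration only.
object_doc = ET.fromstring(
    '<object data="http://example.com/player.swf?playerID=123&amp;videoId=456">'
    '<param name="playerKey" value="abc"/>'
    '</object>')
find_param = make_find_param(
    object_doc, flashvars={'linkBaseURL': 'http://example.com/'})

print(find_param('linkBaseURL'), find_param('playerKey'), find_param('videoId'))
```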
--- a/youtube_dl/extractor/cbs.py
+++ b/youtube_dl/extractor/cbs.py
@@ -1,21 +1,24 @@
 from __future__ import unicode_literals

-from .common import InfoExtractor
+from .theplatform import ThePlatformIE
 from ..utils import (
-    sanitized_Request,
-    smuggle_url,
+    xpath_text,
+    xpath_element,
+    int_or_none,
+    ExtractorError,
+    find_xpath_attr,
 )


-class CBSIE(InfoExtractor):
+class CBSIE(ThePlatformIE):
    _VALID_URL = r'https?://(?:www\.)?(?:cbs\.com/shows/[^/]+/(?:video|artist)|colbertlateshow\.com/(?:video|podcasts))/[^/]+/(?P<id>[^/]+)'

    _TESTS = [{
        'url': 'http://www.cbs.com/shows/garth-brooks/video/_u7W953k6la293J7EPTd9oHkSPs6Xn6_/connect-chat-feat-garth-brooks/',
        'info_dict': {
-            'id': '4JUVEwq3wUT7',
+            'id': '_u7W953k6la293J7EPTd9oHkSPs6Xn6_',
            'display_id': 'connect-chat-feat-garth-brooks',
-            'ext': 'flv',
+            'ext': 'mp4',
            'title': 'Connect Chat feat. Garth Brooks',
            'description': 'Connect with country music singer Garth Brooks, as he chats with fans on Wednesday November 27, 2013. Be sure to tune in to Garth Brooks: Live from Las Vegas, Friday November 29, at 9/8c on CBS!',
            'duration': 1495,
@@ -47,22 +50,55 @@ class CBSIE(InfoExtractor):
        'url': 'http://www.colbertlateshow.com/podcasts/dYSwjqPs_X1tvbV_P2FcPWRa_qT6akTC/in-the-bad-room-with-stephen/',
        'only_matching': True,
    }]
+    TP_RELEASE_URL_TEMPLATE = 'http://link.theplatform.com/s/dJ5BDC/%s?manifest=m3u&mbr=true'
+
+    def _parse_smil_subtitles(self, smil, namespace=None, subtitles_lang='en'):
+        closed_caption_e = find_xpath_attr(smil, self._xpath_ns('.//param', namespace), 'name', 'ClosedCaptionURL')
+        return {
+            'en': [{
+                'ext': 'ttml',
+                'url': closed_caption_e.attrib['value'],
+            }]
+        } if closed_caption_e is not None and closed_caption_e.attrib.get('value') else {}

    def _real_extract(self, url):
        display_id = self._match_id(url)
-        request = sanitized_Request(url)
-        # Android UA is served with higher quality (720p) streams (see
-        # https://github.com/rg3/youtube-dl/issues/7490)
-        request.add_header('User-Agent', 'Mozilla/5.0 (Linux; Android 4.4; Nexus 5)')
-        webpage = self._download_webpage(request, display_id)
-        real_id = self._search_regex(
-            [r"video\.settings\.pid\s*=\s*'([^']+)';", r"cbsplayer\.pid\s*=\s*'([^']+)';"],
-            webpage, 'real video ID')
-        return {
-            '_type': 'url_transparent',
-            'ie_key': 'ThePlatform',
-            'url': smuggle_url(
-                'http://link.theplatform.com/s/dJ5BDC/%s?mbr=true&manifest=m3u' % real_id,
-                {'force_smil_url': True}),
+        webpage = self._download_webpage(url, display_id)
+        content_id = self._search_regex(
+            [r"video\.settings\.content_id\s*=\s*'([^']+)';", r"cbsplayer\.contentId\s*=\s*'([^']+)';"],
+            webpage, 'content id')
+        items_data = self._download_xml(
+            'http://can.cbs.com/thunder/player/videoPlayerService.php',
+            content_id, query={'partner': 'cbs', 'contentId': content_id})
+        video_data = xpath_element(items_data, './/item')
+        title = xpath_text(video_data, 'videoTitle', 'title', True)
+
+        subtitles = {}
+        formats = []
+        for item in items_data.findall('.//item'):
+            pid = xpath_text(item, 'pid')
+            if not pid:
+                continue
+            try:
+                tp_formats, tp_subtitles = self._extract_theplatform_smil(
+                    self.TP_RELEASE_URL_TEMPLATE % pid, content_id, 'Downloading %s SMIL data' % pid)
+            except ExtractorError:
+                continue
+            formats.extend(tp_formats)
+            subtitles = self._merge_subtitles(subtitles, tp_subtitles)
+        self._sort_formats(formats)
+
+        info = self.get_metadata('dJ5BDC/media/guid/2198311517/%s' % content_id, content_id)
+        info.update({
+            'id': content_id,
            'display_id': display_id,
-        }
+            'title': title,
+            'series': xpath_text(video_data, 'seriesTitle'),
+            'season_number': int_or_none(xpath_text(video_data, 'seasonNumber')),
+            'episode_number': int_or_none(xpath_text(video_data, 'episodeNumber')),
+            'duration': int_or_none(xpath_text(video_data, 'videoLength'), 1000),
+            'thumbnail': xpath_text(video_data, 'previewImageURL'),
+            'formats': formats,
+            'subtitles': subtitles,
+        })
+        return info
--- a/youtube_dl/extractor/cbsnews.py
+++ b/youtube_dl/extractor/cbsnews.py
@@ -122,6 +122,7 @@ class CBSNewsLiveVideoIE(InfoExtractor):
            for entry in f4m_formats:
                # URLs without the extra param induce an 404 error
                entry.update({'extra_param_to_segment_url': hdcore_sign})
+        self._sort_formats(f4m_formats)

        return {
            'id': video_id,
--- a/youtube_dl/extractor/chaturbate.py
+++ b/youtube_dl/extractor/chaturbate.py
@@ -48,6 +48,7 @@ class ChaturbateIE(InfoExtractor):
            raise ExtractorError('Unable to find stream URL')

        formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4')
+        self._sort_formats(formats)

        return {
            'id': video_id,
--- a/youtube_dl/extractor/cnbc.py
+++ b/youtube_dl/extractor/cnbc.py
@@ -0,0 +1,33 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import smuggle_url
+
+
+class CNBCIE(InfoExtractor):
+    _VALID_URL = r'https?://video\.cnbc\.com/gallery/\?video=(?P<id>[0-9]+)'
+    _TEST = {
+        'url': 'http://video.cnbc.com/gallery/?video=3000503714',
+        'info_dict': {
+            'id': '3000503714',
+            'ext': 'mp4',
+            'title': 'Fighting zombies is big business',
+            'description': 'md5:0c100d8e1a7947bd2feec9a5550e519e',
+        },
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        return {
+            '_type': 'url_transparent',
+            'ie_key': 'ThePlatform',
+            'url': smuggle_url(
+                'http://link.theplatform.com/s/gZWlPC/media/guid/2408950221/%s?mbr=true&manifest=m3u' % video_id,
+                {'force_smil_url': True}),
+            'id': video_id,
+        }
--- a/youtube_dl/extractor/comcarcoff.py
+++ b/youtube_dl/extractor/comcarcoff.py
@@ -41,7 +41,13 @@ class ComCarCoffIE(InfoExtractor):

        display_id = full_data['activeVideo']['video']
        video_data = full_data.get('videos', {}).get(display_id) or full_data['singleshots'][display_id]
+
        video_id = compat_str(video_data['mediaId'])
+        title = video_data['title']
+        formats = self._extract_m3u8_formats(
+            video_data['mediaUrl'], video_id, 'mp4')
+        self._sort_formats(formats)
+
        thumbnails = [{
            'url': video_data['images']['thumb'],
        }, {
@@ -54,15 +60,14 @@ class ComCarCoffIE(InfoExtractor):
            video_data.get('duration'))

        return {
-            '_type': 'url_transparent',
-            'url': 'crackle:%s' % video_id,
            'id': video_id,
            'display_id': display_id,
-            'title': video_data['title'],
+            'title': title,
            'description': video_data.get('description'),
            'timestamp': timestamp,
            'duration': duration,
            'thumbnails': thumbnails,
+            'formats': formats,
            'season_number': int_or_none(video_data.get('season')),
            'episode_number': int_or_none(video_data.get('episode')),
            'webpage_url': 'http://comediansincarsgettingcoffee.com/%s' % (video_data.get('urlSlug', video_data.get('slug'))),
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -22,8 +22,10 @@ from ..compat import (
    compat_str,
    compat_urllib_error,
    compat_urllib_parse_urlencode,
+    compat_urllib_request,
    compat_urlparse,
 )
+from ..downloader.f4m import remove_encrypted_media
 from ..utils import (
    NO_DEFAULT,
    age_restricted,
@@ -48,6 +50,7 @@ from ..utils import (
    determine_protocol,
    parse_duration,
    mimetype2ext,
+    update_Request,
    update_url_query,
 )

@@ -346,7 +349,7 @@ class InfoExtractor(object):
    def IE_NAME(self):
        return compat_str(type(self).__name__[:-2])

-    def _request_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True, data=None, headers=None, query=None):
+    def _request_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True, data=None, headers={}, query={}):
        """ Returns the response handle """
        if note is None:
            self.report_download_webpage(video_id)
@@ -356,11 +359,14 @@ class InfoExtractor(object):
            else:
                self.to_screen('%s: %s' % (video_id, note))
        # data, headers and query params will be ignored for `Request` objects
-        if isinstance(url_or_request, compat_str):
+        if isinstance(url_or_request, compat_urllib_request.Request):
+            url_or_request = update_Request(
+                url_or_request, data=data, headers=headers, query=query)
+        else:
            if query:
                url_or_request = update_url_query(url_or_request, query)
            if data or headers:
-                url_or_request = sanitized_Request(url_or_request, data, headers or {})
+                url_or_request = sanitized_Request(url_or_request, data, headers)
        try:
            return self._downloader.urlopen(url_or_request)
        except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
@@ -376,7 +382,7 @@ class InfoExtractor(object):
                self._downloader.report_warning(errmsg)
                return False

-    def _download_webpage_handle(self, url_or_request, video_id, note=None, errnote=None, fatal=True, encoding=None, data=None, headers=None, query=None):
+    def _download_webpage_handle(self, url_or_request, video_id, note=None, errnote=None, fatal=True, encoding=None, data=None, headers={}, query={}):
        """ Returns a tuple (page content as string, URL handle) """
        # Strip hashes from the URL (#1038)
        if isinstance(url_or_request, (compat_str, str)):
@@ -469,7 +475,7 @@ class InfoExtractor(object):

        return content

-    def _download_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True, tries=1, timeout=5, encoding=None, data=None, headers=None, query=None):
+    def _download_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True, tries=1, timeout=5, encoding=None, data=None, headers={}, query={}):
        """ Returns the data of the page as a string """
        success = False
        try_count = 0
@@ -490,7 +496,7 @@ class InfoExtractor(object):

    def _download_xml(self, url_or_request, video_id,
                      note='Downloading XML', errnote='Unable to download XML',
-                      transform_source=None, fatal=True, encoding=None, data=None, headers=None, query=None):
+                      transform_source=None, fatal=True, encoding=None, data=None, headers={}, query={}):
        """Return the xml as an xml.etree.ElementTree.Element"""
        xml_string = self._download_webpage(
            url_or_request, video_id, note, errnote, fatal=fatal, encoding=encoding, data=data, headers=headers, query=query)
@@ -504,7 +510,7 @@ class InfoExtractor(object):
                       note='Downloading JSON metadata',
                       errnote='Unable to download JSON metadata',
                       transform_source=None,
-                       fatal=True, encoding=None, data=None, headers=None, query=None):
+                       fatal=True, encoding=None, data=None, headers={}, query={}):
        json_string = self._download_webpage(
            url_or_request, video_id, note, errnote, fatal=fatal,
            encoding=encoding, data=data, headers=headers, query=query)
@@ -989,6 +995,11 @@ class InfoExtractor(object):
        if not media_nodes:
            manifest_version = '2.0'
            media_nodes = manifest.findall('{http://ns.adobe.com/f4m/2.0}media')
+        # Remove unsupported DRM protected media from final formats
+        # rendition (see https://github.com/rg3/youtube-dl/issues/8573).
+        media_nodes = remove_encrypted_media(media_nodes)
+        if not media_nodes:
+            return formats
        base_url = xpath_text(
            manifest, ['{http://ns.adobe.com/f4m/1.0}baseURL', '{http://ns.adobe.com/f4m/2.0}baseURL'],
            'base URL', default=None)
@@ -1021,8 +1032,6 @@ class InfoExtractor(object):
                'height': int_or_none(media_el.attrib.get('height')),
                'preference': preference,
            })
-        self._sort_formats(formats)
-
        return formats

    def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
@@ -1143,7 +1152,6 @@ class InfoExtractor(object):
                    last_media = None
                formats.append(f)
                last_info = {}
-        self._sort_formats(formats)
        return formats

    @staticmethod
@@ -1317,8 +1325,6 @@ class InfoExtractor(object):
                })
                continue

-        self._sort_formats(formats)
-
        return formats

    def _parse_smil_subtitles(self, smil, namespace=None, subtitles_lang='en'):
@@ -1536,7 +1542,6 @@ class InfoExtractor(object):
                            existing_format.update(f)
                    else:
                        self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
-        self._sort_formats(formats)
        return formats

    def _live_title(self, name):
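The common.py hunks above drop the `_sort_formats` call from the manifest helpers, and the extractor hunks that follow add it at each call site. A minimal sketch of that caller-side pattern is given here. All function names are hypothetical stand-ins, not youtube-dl APIs, and the sort key is a simplified assumption (ascending bitrate only).

```python
# Hypothetical stand-ins for manifest helpers; not youtube-dl functions.
def extract_hls_formats():
    # Returns formats in manifest order, without sorting them.
    return [
        {'format_id': 'hls-1200', 'tbr': 1200},
        {'format_id': 'hls-400', 'tbr': 400},
    ]


def extract_hds_formats():
    # A second source of formats, also returned unsorted.
    return [{'format_id': 'hds-800', 'tbr': 800}]


def sort_formats(formats):
    # Simplified assumption: order by total bitrate, best format last.
    # Like the behaviour noted in the kuwo.py comment, it fails on an empty list.
    if not formats:
        raise ValueError('No video formats found')
    formats.sort(key=lambda f: f.get('tbr') or -1)


def collect_formats():
    # The caller merges every source first and sorts exactly once.
    formats = []
    formats.extend(extract_hls_formats())
    formats.extend(extract_hds_formats())
    sort_formats(formats)
    return formats


if __name__ == '__main__':
    print([f['format_id'] for f in collect_formats()])
```

With sorting left to the caller, a helper that returns an empty list no longer fails by itself, and formats gathered from several manifests are ordered together rather than per manifest.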
--- a/youtube_dl/extractor/cwtv.py
+++ b/youtube_dl/extractor/cwtv.py
@@ -57,6 +57,7 @@ class CWTVIE(InfoExtractor):

        formats = self._extract_m3u8_formats(
            video_data['videos']['variantplaylist']['uri'], video_id, 'mp4')
+        self._sort_formats(formats)

        thumbnails = [{
            'url': image['uri'],
--- a/youtube_dl/extractor/dfb.py
+++ b/youtube_dl/extractor/dfb.py
@@ -38,6 +38,7 @@ class DFBIE(InfoExtractor):
        token_el = f4m_info.find('token')
        manifest_url = token_el.attrib['url'] + '?' + 'hdnea=' + token_el.attrib['auth'] + '&hdcore=3.2.0'
        formats = self._extract_f4m_formats(manifest_url, display_id)
+        self._sort_formats(formats)

        return {
            'id': video_id,
--- a/youtube_dl/extractor/discovery.py
+++ b/youtube_dl/extractor/discovery.py
@@ -63,18 +63,23 @@ class DiscoveryIE(InfoExtractor):

        video_title = info.get('playlist_title') or info.get('video_title')

-        entries = [{
-            'id': compat_str(video_info['id']),
-            'formats': self._extract_m3u8_formats(
+        entries = []
+
+        for idx, video_info in enumerate(info['playlist']):
+            formats = self._extract_m3u8_formats(
                video_info['src'], display_id, 'mp4', 'm3u8_native', m3u8_id='hls',
-                note='Download m3u8 information for video %d' % (idx + 1)),
-            'title': video_info['title'],
-            'description': video_info.get('description'),
-            'duration': parse_duration(video_info.get('video_length')),
-            'webpage_url': video_info.get('href') or video_info.get('url'),
-            'thumbnail': video_info.get('thumbnailURL'),
-            'alt_title': video_info.get('secondary_title'),
-            'timestamp': parse_iso8601(video_info.get('publishedDate')),
-        } for idx, video_info in enumerate(info['playlist'])]
+                note='Download m3u8 information for video %d' % (idx + 1))
+            self._sort_formats(formats)
+            entries.append({
+                'id': compat_str(video_info['id']),
+                'formats': formats,
+                'title': video_info['title'],
+                'description': video_info.get('description'),
+                'duration': parse_duration(video_info.get('video_length')),
+                'webpage_url': video_info.get('href') or video_info.get('url'),
+                'thumbnail': video_info.get('thumbnailURL'),
+                'alt_title': video_info.get('secondary_title'),
+                'timestamp': parse_iso8601(video_info.get('publishedDate')),
+            })

        return self.playlist_result(entries, display_id, video_title)
--- a/youtube_dl/extractor/dplay.py
+++ b/youtube_dl/extractor/dplay.py
@@ -118,6 +118,8 @@ class DPlayIE(InfoExtractor):
                if info.get(protocol):
                    extract_formats(protocol, info[protocol])

+        self._sort_formats(formats)
+
        return {
            'id': video_id,
            'display_id': display_id,
--- a/youtube_dl/extractor/dw.py
+++ b/youtube_dl/extractor/dw.py
@@ -39,13 +39,13 @@ class DWIE(InfoExtractor):
        hidden_inputs = self._hidden_inputs(webpage)
        title = hidden_inputs['media_title']

-        formats = []
        if hidden_inputs.get('player_type') == 'video' and hidden_inputs.get('stream_file') == '1':
            formats = self._extract_smil_formats(
                'http://www.dw.com/smil/v-%s' % media_id, media_id,
                transform_source=lambda s: s.replace(
                    'rtmp://tv-od.dw.de/flash/',
                    'http://tv-download.dw.de/dwtv_video/flv/'))
+            self._sort_formats(formats)
        else:
            formats = [{'url': hidden_inputs['file_name']}]

--- a/youtube_dl/extractor/foxnews.py
+++ b/youtube_dl/extractor/foxnews.py
@@ -18,8 +18,8 @@ class FoxNewsIE(AMPIE):
                'title': 'Frozen in Time',
                'description': '16-year-old girl is size of toddler',
                'duration': 265,
-                # 'timestamp': 1304411491,
-                # 'upload_date': '20110503',
+                'timestamp': 1304411491,
+                'upload_date': '20110503',
                'thumbnail': 're:^https?://.*\.jpg$',
            },
        },
@@ -32,8 +32,8 @@ class FoxNewsIE(AMPIE):
                'title': "Rep. Luis Gutierrez on if Obama's immigration plan is legal",
                'description': "Congressman discusses president's plan",
                'duration': 292,
-                # 'timestamp': 1417662047,
-                # 'upload_date': '20141204',
+                'timestamp': 1417662047,
+                'upload_date': '20141204',
                'thumbnail': 're:^https?://.*\.jpg$',
            },
            'params': {
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@@ -406,19 +406,6 @@ class GenericIE(InfoExtractor):
                'skip_download': True,
            },
        },
-        # multiple ooyala embeds on SBN network websites
-        {
-            'url': 'http://www.sbnation.com/college-football-recruiting/2015/2/3/7970291/national-signing-day-rationalizations-itll-be-ok-itll-be-ok',
-            'info_dict': {
-                'id': 'national-signing-day-rationalizations-itll-be-ok-itll-be-ok',
-                'title': '25 lies you will tell yourself on National Signing Day - SBNation.com',
-            },
-            'playlist_mincount': 3,
-            'params': {
-                'skip_download': True,
-            },
-            'add_ie': ['Ooyala'],
-        },
        # embed.ly video
        {
            'url': 'http://www.tested.com/science/weird/460206-tested-grinding-coffee-2000-frames-second/',
@@ -1124,7 +1111,23 @@ class GenericIE(InfoExtractor):
                # m3u8 downloads
                'skip_download': True,
            }
-        }
+        },
+        # Brightcove embed, with no valid 'renditions' but valid 'IOSRenditions'
+        # This video can't be played in browsers if Flash disabled and UA set to iPhone, which is actually a false alarm
+        {
+            'url': 'https://dl.dropboxusercontent.com/u/29092637/interview.html',
+            'info_dict': {
+                'id': '4785848093001',
+                'ext': 'mp4',
+                'title': 'The Cardinal Pell Interview',
+                'description': 'Sky News Contributor Andrew Bolt interviews George Pell in Rome, following the Cardinal\'s evidence before the Royal Commission into Child Abuse. ',
+                'uploader': 'GlobeCast Australia - GlobeStream',
+            },
+            'params': {
+                # m3u8 downloads
+                'skip_download': True,
+            },
+        },
    ]

    def report_following_redirect(self, new_url):
@@ -1294,6 +1297,7 @@ class GenericIE(InfoExtractor):
                    'vcodec': 'none' if m.group('type') == 'audio' else None
                }]
                info_dict['direct'] = True
+            self._sort_formats(formats)
            info_dict['formats'] = formats
            return info_dict

@@ -1320,6 +1324,7 @@ class GenericIE(InfoExtractor):
        # Is it an M3U playlist?
        if first_bytes.startswith(b'#EXTM3U'):
            info_dict['formats'] = self._extract_m3u8_formats(url, video_id, 'mp4')
+            self._sort_formats(info_dict['formats'])
            return info_dict

        # Maybe it's a direct link to a video?
@@ -1344,15 +1349,19 @@ class GenericIE(InfoExtractor):
            if doc.tag == 'rss':
                return self._extract_rss(url, video_id, doc)
            elif re.match(r'^(?:{[^}]+})?smil$', doc.tag):
-                return self._parse_smil(doc, url, video_id)
+                smil = self._parse_smil(doc, url, video_id)
+                self._sort_formats(smil['formats'])
+                return smil
            elif doc.tag == '{http://xspf.org/ns/0/}playlist':
                return self.playlist_result(self._parse_xspf(doc, video_id), video_id)
            elif re.match(r'(?i)^(?:{[^}]+})?MPD$', doc.tag):
                info_dict['formats'] = self._parse_mpd_formats(
                    doc, video_id, mpd_base_url=url.rpartition('/')[0])
+                self._sort_formats(info_dict['formats'])
                return info_dict
            elif re.match(r'^{http://ns\.adobe\.com/f4m/[12]\.0}manifest$', doc.tag):
                info_dict['formats'] = self._parse_f4m_formats(doc, url, video_id)
+                self._sort_formats(info_dict['formats'])
                return info_dict
        except compat_xml_parse_error:
            pass
@@ -2037,6 +2046,9 @@ class GenericIE(InfoExtractor):
            else:
                entry_info_dict['url'] = video_url

+            if entry_info_dict.get('formats'):
+                self._sort_formats(entry_info_dict['formats'])
+
            entries.append(entry_info_dict)

        if len(entries) == 1:
--- a/youtube_dl/extractor/howstuffworks.py
+++ b/youtube_dl/extractor/howstuffworks.py
@@ -6,6 +6,7 @@ from ..utils import (
    int_or_none,
    js_to_json,
    unescapeHTML,
+    determine_ext,
 )


@@ -39,7 +40,7 @@ class HowStuffWorksIE(InfoExtractor):
            'url': 'http://entertainment.howstuffworks.com/arts/2706-sword-swallowing-1-by-dan-meyer-video.htm',
            'info_dict': {
                'id': '440011',
-                'ext': 'flv',
+                'ext': 'mp4',
                'title': 'Sword Swallowing #1 by Dan Meyer',
                'description': 'Video footage (1 of 3) used by permission of the owner Dan Meyer through Sword Swallowers Association International <www.swordswallow.org>',
                'display_id': 'sword-swallowing-1-by-dan-meyer',
@@ -63,13 +64,19 @@ class HowStuffWorksIE(InfoExtractor):
        video_id = clip_info['content_id']
        formats = []
        m3u8_url = clip_info.get('m3u8')
-        if m3u8_url:
-            formats += self._extract_m3u8_formats(m3u8_url, video_id, 'mp4')
+        if m3u8_url and determine_ext(m3u8_url) == 'm3u8':
+            formats.extend(self._extract_m3u8_formats(m3u8_url, video_id, 'mp4', format_id='hls', fatal=True))
+        flv_url = clip_info.get('flv_url')
+        if flv_url:
+            formats.append({
+                'url': flv_url,
+                'format_id': 'flv',
+            })
        for video in clip_info.get('mp4', []):
            formats.append({
                'url': video['src'],
-                'format_id': video['bitrate'],
-                'vbr': int(video['bitrate'].rstrip('k')),
+                'format_id': 'mp4-%s' % video['bitrate'],
+                'vbr': int_or_none(video['bitrate'].rstrip('k')),
            })

        if not formats:
@@ -102,6 +109,6 @@ class HowStuffWorksIE(InfoExtractor):
            'title': unescapeHTML(clip_info['clip_title']),
            'description': unescapeHTML(clip_info.get('caption')),
            'thumbnail': clip_info.get('video_still_url'),
-            'duration': clip_info.get('duration'),
+            'duration': int_or_none(clip_info.get('duration')),
            'formats': formats,
        }
--- a/youtube_dl/extractor/kuwo.py
+++ b/youtube_dl/extractor/kuwo.py
@@ -26,10 +26,23 @@ class KuwoBaseIE(InfoExtractor):
    def _get_formats(self, song_id, tolerate_ip_deny=False):
        formats = []
        for file_format in self._FORMATS:
+            headers = {}
+            cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
+            if cn_verification_proxy:
+                headers['Ytdl-request-proxy'] = cn_verification_proxy
+
+            query = {
+                'format': file_format['ext'],
+                'br': file_format.get('br', ''),
+                'rid': 'MUSIC_%s' % song_id,
+                'type': 'convert_url',
+                'response': 'url'
+            }
+
            song_url = self._download_webpage(
-                'http://antiserver.kuwo.cn/anti.s?format=%s&br=%s&rid=MUSIC_%s&type=convert_url&response=url' %
-                (file_format['ext'], file_format.get('br', ''), song_id),
+                'http://antiserver.kuwo.cn/anti.s',
                song_id, note='Download %s url info' % file_format['format'],
+                query=query, headers=headers,
            )

            if song_url == 'IPDeny' and not tolerate_ip_deny:
@@ -44,18 +57,13 @@ class KuwoBaseIE(InfoExtractor):
                    'abr': file_format.get('abr'),
                })

-        # XXX _sort_formats fails if there are not formats, while it's not the
-        # desired behavior if 'IPDeny' is ignored
-        # This check can be removed if https://github.com/rg3/youtube-dl/pull/8051 is merged
-        if not tolerate_ip_deny:
-            self._sort_formats(formats)
        return formats


 class KuwoIE(KuwoBaseIE):
    IE_NAME = 'kuwo:song'
    IE_DESC = '酷我音乐'
-    _VALID_URL = r'https?://www\.kuwo\.cn/yinyue/(?P<id>\d+?)'
+    _VALID_URL = r'https?://www\.kuwo\.cn/yinyue/(?P<id>\d+)'
    _TESTS = [{
        'url': 'http://www.kuwo.cn/yinyue/635632/',
        'info_dict': {
@@ -103,6 +111,7 @@ class KuwoIE(KuwoBaseIE):
            lrc_content = None

        formats = self._get_formats(song_id)
+        self._sort_formats(formats)

        album_id = self._html_search_regex(
            r'<p[^>]+class="album"[^<]+<a[^>]+href="http://www\.kuwo\.cn/album/(\d+)/"',
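The kuwo.py hunks above make two small changes that can be checked in isolation: the `_VALID_URL` group goes from the lazy `\d+?` to the greedy `\d+`, and the request parameters move from a `%`-formatted string into a `query` dict. The sketch below assumes the pattern is applied with `re.match`. `match_id` and `build_query_url` are hypothetical helpers written for illustration, and the `format` and `br` values are made up.

```python
import re
from urllib.parse import urlencode

OLD_VALID_URL = r'https?://www\.kuwo\.cn/yinyue/(?P<id>\d+?)'
NEW_VALID_URL = r'https?://www\.kuwo\.cn/yinyue/(?P<id>\d+)'
TEST_URL = 'http://www.kuwo.cn/yinyue/635632/'


def match_id(pattern, url):
    # Hypothetical helper: return the named group 'id' from a prefix match.
    return re.match(pattern, url).group('id')


def build_query_url(base_url, query):
    # Hypothetical helper: append an encoded query string to a bare URL.
    # Assumes base_url has no query string of its own.
    return base_url + '?' + urlencode(query)


# A lazy quantifier at the very end of a pattern stops after one digit.
old_id = match_id(OLD_VALID_URL, TEST_URL)
# The greedy form consumes the whole run of digits.
new_id = match_id(NEW_VALID_URL, TEST_URL)

query = {
    'format': 'mp3',        # illustrative value
    'br': '320kmp3',        # illustrative value
    'rid': 'MUSIC_%s' % new_id,
    'type': 'convert_url',
    'response': 'url',
}
song_info_url = build_query_url('http://antiserver.kuwo.cn/anti.s', query)

if __name__ == '__main__':
    print(old_id, new_id)
    print(song_info_url)
```

Keeping the parameters in a dict leaves escaping to the encoder instead of to string formatting.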
--- a/youtube_dl/extractor/laola1tv.py
+++ b/youtube_dl/extractor/laola1tv.py
@@ -130,6 +130,7 @@ class Laola1TvIE(InfoExtractor):
        formats = self._extract_f4m_formats(
            '%s?hdnea=%s&hdcore=3.2.0' % (token_attrib['url'], token_auth),
            video_id, f4m_id='hds')
+        self._sort_formats(formats)

        categories_str = _v('meta_sports')
        categories = categories_str.split(',') if categories_str else []
--- a/youtube_dl/extractor/lrt.py
+++ b/youtube_dl/extractor/lrt.py
@@ -37,6 +37,7 @@ class LRTIE(InfoExtractor):
            r'file\s*:\s*(["\'])(?P<url>.+?)\1\s*\+\s*location\.hash\.substring\(1\)',
            webpage, 'm3u8 url', group='url')
        formats = self._extract_m3u8_formats(m3u8_url, video_id, 'mp4')
+        self._sort_formats(formats)

        thumbnail = self._og_search_thumbnail(webpage)
        description = self._og_search_description(webpage)
--- a/youtube_dl/extractor/lynda.py
+++ b/youtube_dl/extractor/lynda.py
@@ -219,7 +219,7 @@ class LyndaCourseIE(LyndaBaseIE):
                'Course %s does not exist' % course_id, expected=True)

        unaccessible_videos = 0
-        videos = []
+        entries = []

        # Might want to extract videos right here from video['Formats'] as it seems 'Formats' is not provided
        # by single video API anymore
@@ -229,20 +229,22 @@ class LyndaCourseIE(LyndaBaseIE):
                if video.get('HasAccess') is False:
                    unaccessible_videos += 1
                    continue
-                if video.get('ID'):
-                    videos.append(video['ID'])
+                video_id = video.get('ID')
+                if video_id:
+                    entries.append({
+                        '_type': 'url_transparent',
+                        'url': 'http://www.lynda.com/%s/%s-4.html' % (course_path, video_id),
+                        'ie_key': LyndaIE.ie_key(),
+                        'chapter': chapter.get('Title'),
+                        'chapter_number': int_or_none(chapter.get('ChapterIndex')),
+                        'chapter_id': compat_str(chapter.get('ID')),
+                    })

        if unaccessible_videos > 0:
            self._downloader.report_warning(
                '%s videos are only available for members (or paid members) and will not be downloaded. '
                % unaccessible_videos + self._ACCOUNT_CREDENTIALS_HINT)

-        entries = [
-            self.url_result(
-                'http://www.lynda.com/%s/%s-4.html' % (course_path, video_id),
-                'Lynda')
-            for video_id in videos]
-
        course_title = course.get('Title')

        return self.playlist_result(entries, course_id, course_title)
--- a/youtube_dl/extractor/mailru.py
+++ b/youtube_dl/extractor/mailru.py
@@ -13,7 +13,7 @@ from ..utils import (
 class MailRuIE(InfoExtractor):
    IE_NAME = 'mailru'
    IE_DESC = 'Видео@Mail.Ru'
-    _VALID_URL = r'https?://(?:www\.)?my\.mail\.ru/(?:video/.*#video=/?(?P<idv1>(?:[^/]+/){3}\d+)|(?:(?P<idv2prefix>(?:[^/]+/){2})video/(?P<idv2suffix>[^/]+/\d+))\.html)'
+    _VALID_URL = r'https?://(?:(?:www|m)\.)?my\.mail\.ru/(?:video/.*#video=/?(?P<idv1>(?:[^/]+/){3}\d+)|(?:(?P<idv2prefix>(?:[^/]+/){2})video/(?P<idv2suffix>[^/]+/\d+))\.html)'

    _TESTS = [
        {
@@ -61,6 +61,10 @@ class MailRuIE(InfoExtractor):
                'duration': 6001,
            },
            'skip': 'Not accessible from Travis CI server',
+        },
+        {
+            'url': 'http://m.my.mail.ru/mail/3sktvtr/video/_myvideo/138.html',
+            'only_matching': True,
        }
    ]

--- a/youtube_dl/extractor/matchtv.py
+++ b/youtube_dl/extractor/matchtv.py
@@ -47,6 +47,7 @@ class MatchTVIE(InfoExtractor):
        video_url = self._download_json(request, video_id)['data']['videoUrl']
        f4m_url = xpath_text(self._download_xml(video_url, video_id), './to')
        formats = self._extract_f4m_formats(f4m_url, video_id)
+        self._sort_formats(formats)
        return {
            'id': video_id,
            'title': self._live_title('Матч ТВ - Прямой эфир'),
--- a/youtube_dl/extractor/mitele.py
+++ b/youtube_dl/extractor/mitele.py
@@ -67,6 +67,7 @@ class MiTeleIE(InfoExtractor):
            formats.extend(self._extract_f4m_formats(
                file_ + '&hdcore=3.2.0&plugin=aasp-3.2.0.77.18',
                display_id, f4m_id=loc))
+        self._sort_formats(formats)

        title = self._search_regex(
            r'class="Destacado-text"[^>]*>\s*<strong>([^<]+)</strong>', webpage, 'title')
--- a/youtube_dl/extractor/myspace.py
+++ b/youtube_dl/extractor/myspace.py
@@ -2,13 +2,13 @@
 from __future__ import unicode_literals

 import re
-import json

 from .common import InfoExtractor
-from ..compat import (
-    compat_str,
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+    parse_iso8601,
 )
-from ..utils import ExtractorError


 class MySpaceIE(InfoExtractor):
@@ -24,6 +24,8 @@ class MySpaceIE(InfoExtractor):
                'description': 'This country quartet was all smiles while playing a sold out show at the Pacific Amphitheatre in Orange County, California.',
                'uploader': 'Five Minutes to the Stage',
                'uploader_id': 'fiveminutestothestage',
+                'timestamp': 1414108751,
+                'upload_date': '20141023',
            },
            'params': {
                # rtmp download
@@ -64,7 +66,7 @@ class MySpaceIE(InfoExtractor):
                'ext': 'mp4',
                'title': 'Starset - First Light',
                'description': 'md5:2d5db6c9d11d527683bcda818d332414',
-                'uploader': 'Jacob Soren',
+                'uploader': 'Yumi K',
                'uploader_id': 'SorenPromotions',
                'upload_date': '20140725',
            }
@@ -78,6 +80,19 @@ class MySpaceIE(InfoExtractor):
        player_url = self._search_regex(
            r'playerSwf":"([^"?]*)', webpage, 'player URL')

+        def rtmp_format_from_stream_url(stream_url, width=None, height=None):
+            rtmp_url, play_path = stream_url.split(';', 1)
+            return {
+                'format_id': 'rtmp',
+                'url': rtmp_url,
+                'play_path': play_path,
+                'player_url': player_url,
+                'protocol': 'rtmp',
+                'ext': 'flv',
+                'width': width,
+                'height': height,
+            }
+
        if mobj.group('mediatype').startswith('music/song'):
            # songs don't store any useful info in the 'context' variable
            song_data = self._search_regex(
@@ -93,8 +108,8 @@ class MySpaceIE(InfoExtractor):
                return self._search_regex(
                    r'''data-%s=([\'"])(?P<data>.*?)\1''' % name,
                    song_data, name, default='', group='data')
-            streamUrl = search_data('stream-url')
-            if not streamUrl:
+            stream_url = search_data('stream-url')
+            if not stream_url:
                vevo_id = search_data('vevo-id')
                youtube_id = search_data('youtube-id')
                if vevo_id:
@@ -106,36 +121,47 @@ class MySpaceIE(InfoExtractor):
                else:
                    raise ExtractorError(
                        'Found song but don\'t know how to download it')
-            info = {
+            return {
                'id': video_id,
                'title': self._og_search_title(webpage),
                'uploader': search_data('artist-name'),
                'uploader_id': search_data('artist-username'),
                'thumbnail': self._og_search_thumbnail(webpage),
+                'duration': int_or_none(search_data('duration')),
+                'formats': [rtmp_format_from_stream_url(stream_url)]
            }
        else:
-            context = json.loads(self._search_regex(
-                r'context = ({.*?});', webpage, 'context'))
-            video = context['video']
-            streamUrl = video['streamUrl']
-            info = {
-                'id': compat_str(video['mediaId']),
+            video = self._parse_json(self._search_regex(
+                r'context = ({.*?});', webpage, 'context'),
+                video_id)['video']
+            formats = []
+            hls_stream_url = video.get('hlsStreamUrl')
+            if hls_stream_url:
+                formats.append({
+                    'format_id': 'hls',
+                    'url': hls_stream_url,
+                    'protocol': 'm3u8_native',
+                    'ext': 'mp4',
+                })
+            stream_url = video.get('streamUrl')
+            if stream_url:
+                formats.append(rtmp_format_from_stream_url(
+                    stream_url,
+                    int_or_none(video.get('width')),
+                    int_or_none(video.get('height'))))
+            self._sort_formats(formats)
+            return {
+                'id': video_id,
                'title': video['title'],
-                'description': video['description'],
-                'thumbnail': video['imageUrl'],
-                'uploader': video['artistName'],
-                'uploader_id': video['artistUsername'],
+                'description': video.get('description'),
+                'thumbnail': video.get('imageUrl'),
+                'uploader': video.get('artistName'),
+                'uploader_id': video.get('artistUsername'),
+                'duration': int_or_none(video.get('duration')),
+                'timestamp': parse_iso8601(video.get('dateAdded')),
+                'formats': formats,
            }

-        rtmp_url, play_path = streamUrl.split(';', 1)
-        info.update({
-            'url': rtmp_url,
-            'play_path': play_path,
-            'player_url': player_url,
-            'ext': 'flv',
-        })
-        return info
-

 class MySpaceAlbumIE(InfoExtractor):
    IE_NAME = 'MySpace:album'
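The new `rtmp_format_from_stream_url` helper in the myspace.py hunks above depends on `split(';', 1)` to separate the RTMP URL from the play path. A standalone sketch of that split follows. It takes `player_url` as a parameter instead of a closure variable (a change made only for this illustration), and the stream URL is invented.

```python
def rtmp_format_from_stream_url(stream_url, player_url, width=None, height=None):
    # maxsplit=1 splits on the first ';' only, so any later ';'
    # stays inside the play path.
    rtmp_url, play_path = stream_url.split(';', 1)
    return {
        'format_id': 'rtmp',
        'url': rtmp_url,
        'play_path': play_path,
        'player_url': player_url,
        'protocol': 'rtmp',
        'ext': 'flv',
        'width': width,
        'height': height,
    }


# Invented example values; not taken from a real MySpace page.
fmt = rtmp_format_from_stream_url(
    'rtmp://example.invalid/app;mp4:path/to;clip.mp4',
    'http://example.invalid/player.swf')

if __name__ == '__main__':
    print(fmt['url'])
    print(fmt['play_path'])
```

A stream URL that contains no `;` makes the two-name unpacking raise `ValueError`, so a caller may want to check the value before passing it in.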
--- a/youtube_dl/extractor/nationalgeographic.py
+++ b/youtube_dl/extractor/nationalgeographic.py
@@ -4,18 +4,21 @@ from .common import InfoExtractor
 from ..utils import (
    smuggle_url,
    url_basename,
+    update_url_query,
 )


 class NationalGeographicIE(InfoExtractor):
+    IE_NAME = 'natgeo'
    _VALID_URL = r'https?://video\.nationalgeographic\.com/.*?'

    _TESTS = [
        {
            'url': 'http://video.nationalgeographic.com/video/news/150210-news-crab-mating-vin?source=featuredvideo',
+            'md5': '730855d559abbad6b42c2be1fa584917',
            'info_dict': {
-                'id': '4DmDACA6Qtk_',
-                'ext': 'flv',
+                'id': '0000014b-70a1-dd8c-af7f-f7b559330001',
+                'ext': 'mp4',
                'title': 'Mating Crabs Busted by Sharks',
                'description': 'md5:16f25aeffdeba55aaa8ec37e093ad8b3',
            },
@@ -23,9 +26,10 @@ class NationalGeographicIE(InfoExtractor):
        },
        {
            'url': 'http://video.nationalgeographic.com/wild/when-sharks-attack/the-real-jaws',
+            'md5': '6a3105eb448c070503b3105fb9b320b5',
            'info_dict': {
-                'id': '_JeBD_D7PlS5',
-                'ext': 'flv',
+                'id': 'ngc-I0IauNSWznb_UV008GxSbwY35BZvgi2e',
+                'ext': 'mp4',
                'title': 'The Real Jaws',
                'description': 'md5:8d3e09d9d53a85cd397b4b21b2c77be6',
            },
@@ -37,18 +41,61 @@ class NationalGeographicIE(InfoExtractor):
        name = url_basename(url)

        webpage = self._download_webpage(url, name)
-        feed_url = self._search_regex(
-            r'data-feed-url="([^"]+)"', webpage, 'feed url')
        guid = self._search_regex(
            r'id="(?:videoPlayer|player-container)"[^>]+data-guid="([^"]+)"',
            webpage, 'guid')

-        feed = self._download_xml('%s?byGuid=%s' % (feed_url, guid), name)
-        content = feed.find('.//{http://search.yahoo.com/mrss/}content')
-        theplatform_id = url_basename(content.attrib.get('url'))
+        return {
+            '_type': 'url_transparent',
+            'ie_key': 'ThePlatform',
+            'url': smuggle_url(
+                'http://link.theplatform.com/s/ngs/media/guid/2423130747/%s?mbr=true' % guid,
+                {'force_smil_url': True}),
+            'id': guid,
+        }

-        return self.url_result(smuggle_url(
-            'http://link.theplatform.com/s/ngs/%s?formats=MPEG4&manifest=f4m' % theplatform_id,
-            # For some reason, the normal links don't work and we must force
-            # the use of f4m
-            {'force_smil_url': True}))
+
+class NationalGeographicChannelIE(InfoExtractor):
+    IE_NAME = 'natgeo:channel'
+    _VALID_URL = r'https?://channel\.nationalgeographic\.com/(?:wild/)?[^/]+/videos/(?P<id>[^/?]+)'
+
+    _TESTS = [
+        {
+            'url': 'http://channel.nationalgeographic.com/the-story-of-god-with-morgan-freeman/videos/uncovering-a-universal-knowledge/',
+            'md5': '518c9aa655686cf81493af5cc21e2a04',
+            'info_dict': {
+                'id': 'nB5vIAfmyllm',
+                'ext': 'mp4',
+                'title': 'Uncovering a Universal Knowledge',
+                'description': 'md5:1a89148475bf931b3661fcd6ddb2ae3a',
+            },
+            'add_ie': ['ThePlatform'],
+        },
+        {
+            'url': 'http://channel.nationalgeographic.com/wild/destination-wild/videos/the-stunning-red-bird-of-paradise/',
+            'md5': 'c4912f656b4cbe58f3e000c489360989',
+            'info_dict': {
+                'id': '3TmMv9OvGwIR',
+                'ext': 'mp4',
+                'title': 'The Stunning Red Bird of Paradise',
+                'description': 'md5:7bc8cd1da29686be4d17ad1230f0140c',
+            },
+            'add_ie': ['ThePlatform'],
+        },
+    ]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+        release_url = self._search_regex(
+            r'video_auth_playlist_url\s*=\s*"([^"]+)"',
+            webpage, 'release url')
+
+        return {
+            '_type': 'url_transparent',
+            'ie_key': 'ThePlatform',
+            'url': smuggle_url(
+                update_url_query(release_url, {'mbr': 'true', 'switch': 'http'}),
+                {'force_smil_url': True}),
+            'display_id': display_id,
+        }
--- a/youtube_dl/extractor/nbc.py
+++ b/youtube_dl/extractor/nbc.py
@@ -134,6 +134,30 @@ class NBCSportsIE(InfoExtractor):
            NBCSportsVPlayerIE._extract_url(webpage), 'NBCSportsVPlayer')


+class CSNNEIE(InfoExtractor):
+    _VALID_URL = r'https?://www\.csnne\.com/video/(?P<id>[0-9a-z-]+)'
+
+    _TEST = {
+        'url': 'http://www.csnne.com/video/snc-evening-update-wright-named-red-sox-no-5-starter',
+        'info_dict': {
+            'id': 'yvBLLUgQ8WU0',
+            'ext': 'mp4',
+            'title': 'SNC evening update: Wright named Red Sox\' No. 5 starter.',
+            'description': 'md5:1753cfee40d9352b19b4c9b3e589b9e3',
+        }
+    }
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+        return {
+            '_type': 'url_transparent',
+            'ie_key': 'ThePlatform',
+            'url': self._html_search_meta('twitter:player:stream', webpage),
+            'display_id': display_id,
+        }
+
+
 class NBCNewsIE(ThePlatformIE):
    _VALID_URL = r'''(?x)https?://(?:www\.)?nbcnews\.com/
        (?:video/.+?/(?P<id>\d+)|
--- a/youtube_dl/extractor/nrk.py
+++ b/youtube_dl/extractor/nrk.py
@@ -63,6 +63,7 @@ class NRKIE(InfoExtractor):
        if determine_ext(media_url) == 'f4m':
            formats = self._extract_f4m_formats(
                media_url + '?hdcore=3.5.0&plugin=aasp-3.5.0.151.81', video_id, f4m_id='hds')
+            self._sort_formats(formats)
        else:
            formats = [{
                'url': media_url,
--- a/youtube_dl/extractor/pluralsight.py
+++ b/youtube_dl/extractor/pluralsight.py
@@ -279,13 +279,18 @@ class PluralsightCourseIE(PluralsightBaseIE):
            course_id, 'Downloading course data JSON')

        entries = []
-        for module in course_data:
+        for num, module in enumerate(course_data, 1):
            for clip in module.get('clips', []):
                player_parameters = clip.get('playerParameters')
                if not player_parameters:
                    continue
-                entries.append(self.url_result(
-                    '%s/training/player?%s' % (self._API_BASE, player_parameters),
-                    'Pluralsight'))
+                entries.append({
+                    '_type': 'url_transparent',
+                    'url': '%s/training/player?%s' % (self._API_BASE, player_parameters),
+                    'ie_key': PluralsightIE.ie_key(),
+                    'chapter': module.get('title'),
+                    'chapter_number': num,
+                    'chapter_id': module.get('moduleRef'),
+                })

        return self.playlist_result(entries, course_id, title, description)
--- a/youtube_dl/extractor/pornhub.py
+++ b/youtube_dl/extractor/pornhub.py
@@ -1,10 +1,12 @@
 from __future__ import unicode_literals

+import itertools
 import os
 import re

 from .common import InfoExtractor
 from ..compat import (
+    compat_HTTPError,
    compat_urllib_parse_unquote,
    compat_urllib_parse_unquote_plus,
    compat_urllib_parse_urlparse,
@@ -12,6 +14,7 @@ from ..compat import (
 from ..utils import (
    ExtractorError,
    int_or_none,
+    orderedSet,
    sanitized_Request,
    str_to_int,
 )
@@ -75,7 +78,7 @@ class PornHubIE(InfoExtractor):

        flashvars = self._parse_json(
            self._search_regex(
-                r'var\s+flashv1ars_\d+\s*=\s*({.+?});', webpage, 'flashvars', default='{}'),
+                r'var\s+flashvars_\d+\s*=\s*({.+?});', webpage, 'flashvars', default='{}'),
            video_id)
        if flashvars:
            video_title = flashvars.get('video_title')
@@ -149,9 +152,12 @@ class PornHubIE(InfoExtractor):
 class PornHubPlaylistBaseIE(InfoExtractor):
    def _extract_entries(self, webpage):
        return [
-            self.url_result('http://www.pornhub.com/%s' % video_url, PornHubIE.ie_key())
-            for video_url in set(re.findall(
-                r'href="/?(view_video\.php\?.*\bviewkey=[\da-z]+[^"]*)"', webpage))
+            self.url_result(
+                'http://www.pornhub.com/%s' % video_url,
+                PornHubIE.ie_key(), video_title=title)
+            for video_url, title in orderedSet(re.findall(
+                r'href="/?(view_video\.php\?.*\bviewkey=[\da-z]+[^"]*)"[^>]*\s+title="([^"]+)"',
+                webpage))
        ]

    def _real_extract(self, url):
@@ -185,16 +191,32 @@ class PornHubPlaylistIE(PornHubPlaylistBaseIE):
 class PornHubUserVideosIE(PornHubPlaylistBaseIE):
    _VALID_URL = r'https?://(?:www\.)?pornhub\.com/users/(?P<id>[^/]+)/videos'
    _TESTS = [{
-        'url': 'http://www.pornhub.com/users/rushandlia/videos',
+        'url': 'http://www.pornhub.com/users/zoe_ph/videos/public',
        'info_dict': {
-            'id': 'rushandlia',
+            'id': 'zoe_ph',
        },
-        'playlist_mincount': 13,
+        'playlist_mincount': 171,
+    }, {
+        'url': 'http://www.pornhub.com/users/rushandlia/videos',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
        user_id = self._match_id(url)

-        webpage = self._download_webpage(url, user_id)
+        entries = []
+        for page_num in itertools.count(1):
+            try:
+                webpage = self._download_webpage(
+                    url, user_id, 'Downloading page %d' % page_num,
+                    query={'page': page_num})
+            except ExtractorError as e:
+                if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404:
+                    break
+                raise
+            page_entries = self._extract_entries(webpage)
+            if not page_entries:
+                break
+            entries.extend(page_entries)

-        return self.playlist_result(self._extract_entries(webpage), user_id)
+        return self.playlist_result(entries, user_id)
--- a/youtube_dl/extractor/restudy.py
+++ b/youtube_dl/extractor/restudy.py
@@ -31,6 +31,7 @@ class RestudyIE(InfoExtractor):
        formats = self._extract_smil_formats(
            'https://www.restudy.dk/awsmedia/SmilDirectory/video_%s.xml' % video_id,
            video_id)
+        self._sort_formats(formats)

        return {
            'id': video_id,
--- a/youtube_dl/extractor/rte.py
+++ b/youtube_dl/extractor/rte.py
@@ -49,6 +49,7 @@ class RteIE(InfoExtractor):
        # f4m_url = server + relative_url
        f4m_url = json_string['shows'][0]['media:group'][0]['rte:server'] + json_string['shows'][0]['media:group'][0]['url']
        f4m_formats = self._extract_f4m_formats(f4m_url, video_id)
+        self._sort_formats(f4m_formats)

        return {
            'id': video_id,
--- a/youtube_dl/extractor/rtve.py
+++ b/youtube_dl/extractor/rtve.py
@@ -209,6 +209,7 @@ class RTVELiveIE(InfoExtractor):
        png = self._download_webpage(png_url, video_id, 'Downloading url information')
        m3u8_url = _decrypt_url(png)
        formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4')
+        self._sort_formats(formats)

        return {
            'id': video_id,
--- a/youtube_dl/extractor/rtvnh.py
+++ b/youtube_dl/extractor/rtvnh.py
@@ -38,6 +38,7 @@ class RTVNHIE(InfoExtractor):
                    item['file'], video_id, ext='mp4', entry_protocol='m3u8_native'))
            elif item.get('type') == '':
                formats.append({'url': item['file']})
+        self._sort_formats(formats)

        return {
            'id': video_id,
--- a/youtube_dl/extractor/shahid.py
+++ b/youtube_dl/extractor/shahid.py
@@ -77,6 +77,7 @@ class ShahidIE(InfoExtractor):
            raise ExtractorError('This video is DRM protected.', expected=True)

        formats = self._extract_m3u8_formats(player['url'], video_id, 'mp4')
+        self._sort_formats(formats)

        video = self._download_json(
            '%s/%s/%s?%s' % (
--- a/youtube_dl/extractor/sportbox.py
+++ b/youtube_dl/extractor/sportbox.py
@@ -99,6 +99,7 @@ class SportBoxEmbedIE(InfoExtractor):
            webpage, 'hls file')

        formats = self._extract_m3u8_formats(hls, video_id, 'mp4')
+        self._sort_formats(formats)

        title = self._search_regex(
            r'sportboxPlayer\.node_title\s*=\s*"([^"]+)"', webpage, 'title')
--- a/youtube_dl/extractor/telecinco.py
+++ b/youtube_dl/extractor/telecinco.py
@@ -82,6 +82,7 @@ class TelecincoIE(InfoExtractor):
        )
        formats = self._extract_m3u8_formats(
            token_info['tokenizedUrl'], episode, ext='mp4', entry_protocol='m3u8_native')
+        self._sort_formats(formats)

        return {
            'id': embed_data['videoId'],
--- a/youtube_dl/extractor/tenplay.py
+++ b/youtube_dl/extractor/tenplay.py
@@ -1,90 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-from ..utils import (
-    int_or_none,
-    float_or_none,
-)
-
-
-class TenPlayIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?ten(play)?\.com\.au/.+'
-    _TEST = {
-        'url': 'http://tenplay.com.au/ten-insider/extra/season-2013/tenplay-tv-your-way',
-        'info_dict': {
-            'id': '2695695426001',
-            'ext': 'flv',
-            'title': 'TENplay: TV your way',
-            'description': 'Welcome to a new TV experience. Enjoy a taste of the TENplay benefits.',
-            'timestamp': 1380150606.889,
-            'upload_date': '20130925',
-            'uploader': 'TENplay',
-        },
-        'params': {
-            'skip_download': True,  # Requires rtmpdump
-        }
-    }
-
-    _video_fields = [
-        'id', 'name', 'shortDescription', 'longDescription', 'creationDate',
-        'publishedDate', 'lastModifiedDate', 'customFields', 'videoStillURL',
-        'thumbnailURL', 'referenceId', 'length', 'playsTotal',
-        'playsTrailingWeek', 'renditions', 'captioning', 'startDate', 'endDate']
-
-    def _real_extract(self, url):
-        webpage = self._download_webpage(url, url)
-        video_id = self._html_search_regex(
-            r'videoID: "(\d+?)"', webpage, 'video_id')
-        api_token = self._html_search_regex(
-            r'apiToken: "([a-zA-Z0-9-_\.]+?)"', webpage, 'api_token')
-        title = self._html_search_regex(
-            r'<meta property="og:title" content="\s*(.*?)\s*"\s*/?\s*>',
-            webpage, 'title')
-
-        json = self._download_json('https://api.brightcove.com/services/library?command=find_video_by_id&video_id=%s&token=%s&video_fields=%s' % (video_id, api_token, ','.join(self._video_fields)), title)
-
-        formats = []
-        for rendition in json['renditions']:
-            url = rendition['remoteUrl'] or rendition['url']
-            protocol = 'rtmp' if url.startswith('rtmp') else 'http'
-            ext = 'flv' if protocol == 'rtmp' else rendition['videoContainer'].lower()
-
-            if protocol == 'rtmp':
-                url = url.replace('&mp4:', '')
-
-                tbr = int_or_none(rendition.get('encodingRate'), 1000)
-
-            formats.append({
-                'format_id': '_'.join(
-                    ['rtmp', rendition['videoContainer'].lower(),
-                     rendition['videoCodec'].lower(), '%sk' % tbr]),
-                'width': int_or_none(rendition['frameWidth']),
-                'height': int_or_none(rendition['frameHeight']),
-                'tbr': tbr,
-                'filesize': int_or_none(rendition['size']),
-                'protocol': protocol,
-                'ext': ext,
-                'vcodec': rendition['videoCodec'].lower(),
-                'container': rendition['videoContainer'].lower(),
-                'url': url,
-            })
-        self._sort_formats(formats)
-
-        return {
-            'id': video_id,
-            'display_id': json['referenceId'],
-            'title': json['name'],
-            'description': json['shortDescription'] or json['longDescription'],
-            'formats': formats,
-            'thumbnails': [{
-                'url': json['videoStillURL']
-            }, {
-                'url': json['thumbnailURL']
-            }],
-            'thumbnail': json['videoStillURL'],
-            'duration': float_or_none(json.get('length'), 1000),
-            'timestamp': float_or_none(json.get('creationDate'), 1000),
-            'uploader': json.get('customFields', {}).get('production_company_distributor') or 'TENplay',
-            'view_count': int_or_none(json.get('playsTotal')),
-        }
--- a/youtube_dl/extractor/theplatform.py
+++ b/youtube_dl/extractor/theplatform.py
@@ -82,7 +82,7 @@ class ThePlatformBaseIE(OnceIE):
 class ThePlatformIE(ThePlatformBaseIE):
    _VALID_URL = r'''(?x)
        (?:https?://(?:link|player)\.theplatform\.com/[sp]/(?P<provider_id>[^/]+)/
-           (?:(?P<media>(?:(?:[^/]+/)+select/)?media/)|(?P<config>(?:[^/\?]+/(?:swf|config)|onsite)/select/))?
+           (?:(?:(?:[^/]+/)+select/)?(?P<media>media/(?:guid/\d+/)?)|(?P<config>(?:[^/\?]+/(?:swf|config)|onsite)/select/))?
         |theplatform:)(?P<id>[^/\?&]+)'''

    _TESTS = [{
@@ -170,10 +170,10 @@ class ThePlatformIE(ThePlatformBaseIE):
        if not provider_id:
            provider_id = 'dJ5BDC'

-        path = provider_id
+        path = provider_id + '/'
        if mobj.group('media'):
-            path += '/media'
-        path += '/' + video_id
+            path += mobj.group('media')
+        path += video_id

        qs_dict = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
        if 'guid' in qs_dict:
--- a/youtube_dl/extractor/tubitv.py
+++ b/youtube_dl/extractor/tubitv.py
@@ -69,6 +69,7 @@ class TubiTvIE(InfoExtractor):
        apu = self._search_regex(r"apu='([^']+)'", webpage, 'apu')
        m3u8_url = codecs.decode(apu, 'rot_13')[::-1]
        formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4')
+        self._sort_formats(formats)

        return {
            'id': video_id,
--- a/youtube_dl/extractor/tudou.py
+++ b/youtube_dl/extractor/tudou.py
@@ -5,6 +5,7 @@ from __future__ import unicode_literals
 from .common import InfoExtractor
 from ..compat import compat_str
 from ..utils import (
+    ExtractorError,
    int_or_none,
    InAdvancePagedList,
    float_or_none,
@@ -46,6 +47,19 @@ class TudouIE(InfoExtractor):

    _PLAYER_URL = 'http://js.tudouui.com/bin/lingtong/PortalPlayer_177.swf'

+    # Translated from tudou/tools/TVCHelper.as in PortalPlayer_193.swf
+    # 0001, 0002 and 4001 are not included as they indicate temporary issues
+    TVC_ERRORS = {
+        '0003': 'The video is deleted or does not exist',
+        '1001': 'This video is unavailable due to licensing issues',
+        '1002': 'This video is unavailable as it\'s under review',
+        '1003': 'This video is unavailable as it\'s under review',
+        '3001': 'Password required',
+        '5001': 'This video is available in Mainland China only due to licensing issues',
+        '7001': 'This video is unavailable',
+        '8001': 'This video is unavailable due to licensing issues',
+    }
+
    def _url_for_id(self, video_id, quality=None):
        info_url = 'http://v2.tudou.com/f?id=' + compat_str(video_id)
        if quality:
@@ -63,6 +77,15 @@ class TudouIE(InfoExtractor):
        if youku_vcode:
            return self.url_result('youku:' + youku_vcode, ie='Youku')

+        if not item_data.get('itemSegs'):
+            tvc_code = item_data.get('tvcCode')
+            if tvc_code:
+                err_msg = self.TVC_ERRORS.get(tvc_code)
+                if err_msg:
+                    raise ExtractorError('Tudou said: %s' % err_msg, expected=True)
+                raise ExtractorError('Unexpected error %s returned from Tudou' % tvc_code)
+            raise ExtractorError('Unexpected error returned from Tudou')
+
        title = unescapeHTML(item_data['kw'])
        description = item_data.get('desc')
        thumbnail_url = item_data.get('pic')
--- a/youtube_dl/extractor/tumblr.py
+++ b/youtube_dl/extractor/tumblr.py
@@ -8,7 +8,7 @@ from ..utils import int_or_none


 class TumblrIE(InfoExtractor):
-    _VALID_URL = r'https?://(?P<blog_name>.*?)\.tumblr\.com/(?:post|video)/(?P<id>[0-9]+)(?:$|[/?#])'
+    _VALID_URL = r'https?://(?P<blog_name>[^/?#&]+)\.tumblr\.com/(?:post|video)/(?P<id>[0-9]+)(?:$|[/?#])'
    _TESTS = [{
        'url': 'http://tatianamaslanydaily.tumblr.com/post/54196191430/orphan-black-dvd-extra-behind-the-scenes',
        'md5': '479bb068e5b16462f5176a6828829767',
--- a/youtube_dl/extractor/twitter.py
+++ b/youtube_dl/extractor/twitter.py
@@ -102,6 +102,9 @@ class TwitterCardIE(TwitterBaseIE):
            r'data-(?:player-)?config="([^"]+)"', webpage, 'data player config'),
            video_id)

+        if config.get('source_type') == 'vine':
+            return self.url_result(config['player_url'], 'Vine')
+
        def _search_dimensions_in_video_url(a_format, video_url):
            m = re.search(r'/(?P<width>\d+)x(?P<height>\d+)/', video_url)
            if m:
@@ -110,10 +113,9 @@ class TwitterCardIE(TwitterBaseIE):
                    'height': int(m.group('height')),
                })

-        playlist = config.get('playlist')
-        if playlist:
-            video_url = playlist[0]['source']
+        video_url = config.get('video_url') or config.get('playlist', [{}])[0].get('source')

+        if video_url:
            f = {
                'url': video_url,
            }
@@ -185,7 +187,6 @@ class TwitterIE(InfoExtractor):
            'ext': 'mp4',
            'title': 'FREE THE NIPPLE - FTN supporters on Hollywood Blvd today!',
            'thumbnail': 're:^https?://.*\.jpg',
-            'duration': 12.922,
            'description': 'FREE THE NIPPLE on Twitter: "FTN supporters on Hollywood Blvd today! http://t.co/c7jHH749xJ"',
            'uploader': 'FREE THE NIPPLE',
            'uploader_id': 'freethenipple',
@@ -247,6 +248,18 @@ class TwitterIE(InfoExtractor):
        'params': {
            'skip_download': True,  # requires ffmpeg
        },
+    }, {
+        'url': 'https://twitter.com/Filmdrunk/status/713801302971588609',
+        'md5': '89a15ed345d13b86e9a5a5e051fa308a',
+        'info_dict': {
+            'id': 'MIOxnrUteUd',
+            'ext': 'mp4',
+            'title': 'Dr.Pepperの飲み方 #japanese #バカ #ドクペ #電動ガン',
+            'uploader': 'TAKUMA',
+            'uploader_id': '1004126642786242560',
+            'upload_date': '20140615',
+        },
+        'add_ie': ['Vine'],
    }]

    def _real_extract(self, url):
--- a/youtube_dl/extractor/udemy.py
+++ b/youtube_dl/extractor/udemy.py
@@ -1,5 +1,7 @@
 from __future__ import unicode_literals

+import re
+
 from .common import InfoExtractor
 from ..compat import (
    compat_HTTPError,
@@ -8,6 +10,8 @@ from ..compat import (
    compat_urlparse,
 )
 from ..utils import (
+    determine_ext,
+    extract_attributes,
    ExtractorError,
    float_or_none,
    int_or_none,
@@ -50,22 +54,37 @@ class UdemyIE(InfoExtractor):
        'only_matching': True,
    }]

+    def _extract_course_info(self, webpage, video_id):
+        course = self._parse_json(
+            unescapeHTML(self._search_regex(
+                r'ng-init=["\'].*\bcourse=({.+?});', webpage, 'course', default='{}')),
+            video_id, fatal=False) or {}
+        course_id = course.get('id') or self._search_regex(
+            (r'&quot;id&quot;\s*:\s*(\d+)', r'data-course-id=["\'](\d+)'),
+            webpage, 'course id')
+        return course_id, course.get('title')
+
    def _enroll_course(self, base_url, webpage, course_id):
+        def combine_url(base_url, url):
+            return compat_urlparse.urljoin(base_url, url) if not url.startswith('http') else url
+
        checkout_url = unescapeHTML(self._search_regex(
-            r'href=(["\'])(?P<url>https?://(?:www\.)?udemy\.com/payment/checkout/.+?)\1',
+            r'href=(["\'])(?P<url>(?:https?://(?:www\.)?udemy\.com)?/payment/checkout/.+?)\1',
            webpage, 'checkout url', group='url', default=None))
        if checkout_url:
            raise ExtractorError(
                'Course %s is not free. You have to pay for it before you can download. '
-                'Use this URL to confirm purchase: %s' % (course_id, checkout_url), expected=True)
+                'Use this URL to confirm purchase: %s'
+                % (course_id, combine_url(base_url, checkout_url)),
+                expected=True)

        enroll_url = unescapeHTML(self._search_regex(
            r'href=(["\'])(?P<url>(?:https?://(?:www\.)?udemy\.com)?/course/subscribe/.+?)\1',
            webpage, 'enroll url', group='url', default=None))
        if enroll_url:
-            if not enroll_url.startswith('http'):
-                enroll_url = compat_urlparse.urljoin(base_url, enroll_url)
-            webpage = self._download_webpage(enroll_url, course_id, 'Enrolling in the course')
+            webpage = self._download_webpage(
+                combine_url(base_url, enroll_url),
+                course_id, 'Enrolling in the course')
            if '>You have enrolled in' in webpage:
                self.to_screen('%s: Successfully enrolled in the course' % course_id)

@@ -73,11 +92,8 @@ class UdemyIE(InfoExtractor):
        return self._download_json(
            'https://www.udemy.com/api-2.0/users/me/subscribed-courses/%s/lectures/%s?%s' % (
                course_id, lecture_id, compat_urllib_parse_urlencode({
-                    'video_only': '',
-                    'auto_play': '',
-                    'fields[lecture]': 'title,description,asset',
+                    'fields[lecture]': 'title,description,view_html,asset',
                    'fields[asset]': 'asset_type,stream_url,thumbnail_url,download_urls,data',
-                    'instructorPreviewMode': 'False',
                })),
            lecture_id, 'Downloading lecture JSON')

@@ -92,7 +108,7 @@ class UdemyIE(InfoExtractor):
                error_str += ' - %s' % error_data.get('formErrors')
            raise ExtractorError(error_str, expected=True)

-    def _download_json(self, url_or_request, video_id, note='Downloading JSON metadata'):
+    def _download_json(self, url_or_request, *args, **kwargs):
        headers = {
            'X-Udemy-Snail-Case': 'true',
            'X-Requested-With': 'XMLHttpRequest',
@@ -110,7 +126,7 @@ class UdemyIE(InfoExtractor):
        else:
            url_or_request = sanitized_Request(url_or_request, headers=headers)

-        response = super(UdemyIE, self)._download_json(url_or_request, video_id, note)
+        response = super(UdemyIE, self)._download_json(url_or_request, *args, **kwargs)
        self._handle_error(response)
        return response

@@ -160,9 +176,7 @@ class UdemyIE(InfoExtractor):

        webpage = self._download_webpage(url, lecture_id)

-        course_id = self._search_regex(
-            (r'data-course-id=["\'](\d+)', r'&quot;id&quot;\s*:\s*(\d+)'),
-            webpage, 'course id')
+        course_id, _ = self._extract_course_info(webpage, lecture_id)

        try:
            lecture = self._download_lecture(course_id, lecture_id)
@@ -200,7 +214,7 @@ class UdemyIE(InfoExtractor):
        def extract_output_format(src):
            return {
                'url': src['url'],
-                'format_id': '%sp' % (src.get('label') or format_id),
+                'format_id': '%sp' % (src.get('height') or format_id),
                'width': int_or_none(src.get('width')),
                'height': int_or_none(src.get('height')),
                'vbr': int_or_none(src.get('video_bitrate_in_kbps')),
@@ -217,9 +231,13 @@ class UdemyIE(InfoExtractor):
        if not isinstance(outputs, dict):
            outputs = {}

-        for format_id, output in outputs.items():
-            if isinstance(output, dict) and output.get('url'):
-                formats.append(extract_output_format(output))
+        def add_output_format_meta(f, key):
+            output = outputs.get(key)
+            if isinstance(output, dict):
+                output_format = extract_output_format(output)
+                output_format.update(f)
+                return output_format
+            return f

        download_urls = asset.get('download_urls')
        if isinstance(download_urls, dict):
@@ -232,21 +250,48 @@ class UdemyIE(InfoExtractor):
                    format_id = format_.get('label')
                    f = {
                        'url': format_['file'],
+                        'format_id': '%sp' % format_id,
                        'height': int_or_none(format_id),
                    }
                    if format_id:
                        # Some videos contain additional metadata (e.g.
                        # https://www.udemy.com/ios9-swift/learn/#/lecture/3383208)
-                        output = outputs.get(format_id)
-                        if isinstance(output, dict):
-                            output_format = extract_output_format(output)
-                            output_format.update(f)
-                            f = output_format
-                        else:
-                            f['format_id'] = '%sp' % format_id
+                        f = add_output_format_meta(f, format_id)
                    formats.append(f)

-        self._sort_formats(formats)
+        view_html = lecture.get('view_html')
+        if view_html:
+            view_html_urls = set()
+            for source in re.findall(r'<source[^>]+>', view_html):
+                attributes = extract_attributes(source)
+                src = attributes.get('src')
+                if not src:
+                    continue
+                res = attributes.get('data-res')
+                height = int_or_none(res)
+                if src in view_html_urls:
+                    continue
+                view_html_urls.add(src)
+                if attributes.get('type') == 'application/x-mpegURL' or determine_ext(src) == 'm3u8':
+                    m3u8_formats = self._extract_m3u8_formats(
+                        src, video_id, 'mp4', entry_protocol='m3u8_native',
+                        m3u8_id='hls', fatal=False)
+                    for f in m3u8_formats:
+                        m = re.search(r'/hls_(?P<height>\d{3,4})_(?P<tbr>\d{2,})/', f['url'])
+                        if m:
+                            if not f.get('height'):
+                                f['height'] = int(m.group('height'))
+                            if not f.get('tbr'):
+                                f['tbr'] = int(m.group('tbr'))
+                    formats.extend(m3u8_formats)
+                else:
+                    formats.append(add_output_format_meta({
+                        'url': src,
+                        'format_id': '%dp' % height if height else None,
+                        'height': height,
+                    }, res))
+
+        self._sort_formats(formats, field_preference=('height', 'width', 'tbr', 'format_id'))

        return {
            'id': video_id,
@@ -260,7 +305,7 @@ class UdemyIE(InfoExtractor):

 class UdemyCourseIE(UdemyIE):
    IE_NAME = 'udemy:course'
-    _VALID_URL = r'https?://www\.udemy\.com/(?P<id>[\da-z-]+)'
+    _VALID_URL = r'https?://www\.udemy\.com/(?P<id>[^/?#&]+)'
    _TESTS = []

    @classmethod
@@ -272,29 +317,29 @@ class UdemyCourseIE(UdemyIE):

        webpage = self._download_webpage(url, course_path)

-        response = self._download_json(
-            'https://www.udemy.com/api-1.1/courses/%s' % course_path,
-            course_path, 'Downloading course JSON')
-
-        course_id = response['id']
-        course_title = response.get('title')
+        course_id, title = self._extract_course_info(webpage, course_path)

        self._enroll_course(url, webpage, course_id)

        response = self._download_json(
-            'https://www.udemy.com/api-1.1/courses/%s/curriculum' % course_id,
-            course_id, 'Downloading course curriculum')
+            'https://www.udemy.com/api-2.0/courses/%s/cached-subscriber-curriculum-items' % course_id,
+            course_id, 'Downloading course curriculum', query={
+                'fields[chapter]': 'title,object_index',
+                'fields[lecture]': 'title',
+                'page_size': '1000',
+            })

        entries = []
-        chapter, chapter_number = None, None
-        for asset in response:
-            asset_type = asset.get('assetType') or asset.get('asset_type')
-            if asset_type == 'Video':
-                asset_id = asset.get('id')
-                if asset_id:
+        chapter, chapter_number = [None] * 2
+        for entry in response['results']:
+            clazz = entry.get('_class')
+            if clazz == 'lecture':
+                lecture_id = entry.get('id')
+                if lecture_id:
                    entry = {
                        '_type': 'url_transparent',
-                        'url': 'https://www.udemy.com/%s/#/lecture/%s' % (course_path, asset['id']),
+                        'url': 'https://www.udemy.com/%s/learn/v4/t/lecture/%s' % (course_path, entry['id']),
+                        'title': entry.get('title'),
                        'ie_key': UdemyIE.ie_key(),
                    }
                    if chapter_number:
@@ -302,8 +347,8 @@ class UdemyCourseIE(UdemyIE):
                    if chapter:
                        entry['chapter'] = chapter
                    entries.append(entry)
-            elif asset.get('type') == 'chapter':
-                chapter_number = asset.get('index') or asset.get('object_index')
-                chapter = asset.get('title')
+            elif clazz == 'chapter':
+                chapter_number = entry.get('object_index')
+                chapter = entry.get('title')

-        return self.playlist_result(entries, course_id, course_title)
+        return self.playlist_result(entries, course_id, title)
--- a/youtube_dl/extractor/vevo.py
+++ b/youtube_dl/extractor/vevo.py
@@ -152,7 +152,7 @@ class VevoIE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)

-        json_url = 'http://videoplayer.vevo.com/VideoService/AuthenticateVideo?isrc=%s' % video_id
+        json_url = 'http://api.vevo.com/VideoService/AuthenticateVideo?isrc=%s' % video_id
        response = self._download_json(
            json_url, video_id, 'Downloading video info', 'Unable to download info')
        video_info = response.get('video') or {}
--- a/youtube_dl/extractor/videomore.py
+++ b/youtube_dl/extractor/videomore.py
@@ -111,6 +111,7 @@ class VideomoreIE(InfoExtractor):

        video_url = xpath_text(video, './/video_url', 'video url', fatal=True)
        formats = self._extract_f4m_formats(video_url, video_id, f4m_id='hds')
+        self._sort_formats(formats)

        data = self._download_json(
            'http://videomore.ru/video/tracks/%s.json' % video_id,
--- a/youtube_dl/extractor/vier.py
+++ b/youtube_dl/extractor/vier.py
@@ -50,6 +50,7 @@ class VierIE(InfoExtractor):

        playlist_url = 'http://vod.streamcloud.be/%s/mp4:_definst_/%s.mp4/playlist.m3u8' % (application, filename)
        formats = self._extract_m3u8_formats(playlist_url, display_id, 'mp4')
+        self._sort_formats(formats)

        title = self._og_search_title(webpage, default=display_id)
        description = self._og_search_description(webpage, default=None)
--- a/youtube_dl/extractor/viidea.py
+++ b/youtube_dl/extractor/viidea.py
@@ -151,6 +151,7 @@ class ViideaIE(InfoExtractor):
                smil_url = '%s/%s/video/%s/smil.xml' % (base_url, lecture_slug, part_id)
                smil = self._download_smil(smil_url, lecture_id)
                info = self._parse_smil(smil, smil_url, lecture_id)
+                self._sort_formats(info['formats'])
                info['id'] = lecture_id if not multipart else '%s_part%s' % (lecture_id, part_id)
                info['display_id'] = lecture_slug if not multipart else '%s_part%s' % (lecture_slug, part_id)
                if multipart:
--- a/youtube_dl/extractor/voxmedia.py
+++ b/youtube_dl/extractor/voxmedia.py
@@ -0,0 +1,132 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import compat_urllib_parse_unquote
+
+
+class VoxMediaIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?(?:theverge|vox|sbnation|eater|polygon|curbed|racked)\.com/(?:[^/]+/)*(?P<id>[^/?]+)'
+    _TESTS = [{
+        'url': 'http://www.theverge.com/2014/6/27/5849272/material-world-how-google-discovered-what-software-is-made-of',
+        'md5': '73856edf3e89a711e70d5cf7cb280b37',
+        'info_dict': {
+            'id': '11eXZobjrG8DCSTgrNjVinU-YmmdYjhe',
+            'ext': 'mp4',
+            'title': 'Google\'s new material design direction',
+            'description': 'md5:2f44f74c4d14a1f800ea73e1c6832ad2',
+        }
+    }, {
+        # data-ooyala-id
+        'url': 'http://www.theverge.com/2014/10/21/7025853/google-nexus-6-hands-on-photos-video-android-phablet',
+        'md5': 'd744484ff127884cd2ba09e3fa604e4b',
+        'info_dict': {
+            'id': 'RkZXU4cTphOCPDMZg5oEounJyoFI0g-B',
+            'ext': 'mp4',
+            'title': 'The Nexus 6: hands-on with Google\'s phablet',
+            'description': 'md5:87a51fe95ff8cea8b5bdb9ac7ae6a6af',
+        }
+    }, {
+        # volume embed
+        'url': 'http://www.vox.com/2016/3/31/11336640/mississippi-lgbt-religious-freedom-bill',
+        'md5': '375c483c5080ab8cd85c9c84cfc2d1e4',
+        'info_dict': {
+            'id': 'wydzk3dDpmRz7PQoXRsTIX6XTkPjYL0b',
+            'ext': 'mp4',
+            'title': 'The new frontier of LGBTQ civil rights, explained',
+            'description': 'md5:0dc58e94a465cbe91d02950f770eb93f',
+        }
+    }, {
+        # youtube embed
+        'url': 'http://www.vox.com/2016/3/24/11291692/robot-dance',
+        'md5': '83b3080489fb103941e549352d3e0977',
+        'info_dict': {
+            'id': 'FcNHTJU1ufM',
+            'ext': 'mp4',
+            'title': 'How "the robot" became the greatest novelty dance of all time',
+            'description': 'md5:b081c0d588b8b2085870cda55e6da176',
+            'upload_date': '20160324',
+            'uploader_id': 'voxdotcom',
+            'uploader': 'Vox',
+        }
+    }, {
+        # SBN.VideoLinkset.entryGroup multiple ooyala embeds
+        'url': 'http://www.sbnation.com/college-football-recruiting/2015/2/3/7970291/national-signing-day-rationalizations-itll-be-ok-itll-be-ok',
+        'info_dict': {
+            'id': 'national-signing-day-rationalizations-itll-be-ok-itll-be-ok',
+            'title': '25 lies you will tell yourself on National Signing Day',
+            'description': 'It\'s the most self-delusional time of the year, and everyone\'s gonna tell the same lies together!',
+        },
+        'playlist': [{
+            'md5': '721fededf2ab74ae4176c8c8cbfe092e',
+            'info_dict': {
+                'id': 'p3cThlMjE61VDi_SD9JlIteSNPWVDBB9',
+                'ext': 'mp4',
+                'title': 'Buddy Hield vs Steph Curry (and the world)',
+                'description': 'Let’s dissect only the most important Final Four storylines.',
+            },
+        }, {
+            'md5': 'bf0c5cc115636af028be1bab79217ea9',
+            'info_dict': {
+                'id': 'BmbmVjMjE6esPHxdALGubTrouQ0jYLHj',
+                'ext': 'mp4',
+                'title': 'Chasing Cinderella 2016: Syracuse basketball',
+                'description': 'md5:e02d56b026d51aa32c010676765a690d',
+            },
+        }],
+    }]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = compat_urllib_parse_unquote(self._download_webpage(url, display_id))
+
+        def create_entry(provider_video_id, provider_video_type, title=None, description=None):
+            return {
+                '_type': 'url_transparent',
+                'url': provider_video_id if provider_video_type == 'youtube' else '%s:%s' % (provider_video_type, provider_video_id),
+                'title': title or self._og_search_title(webpage),
+                'description': description or self._og_search_description(webpage),
+            }
+
+        entries = []
+        entries_data = self._search_regex([
+            r'Chorus\.VideoContext\.addVideo\((\[{.+}\])\);',
+            r'var\s+entry\s*=\s*({.+});',
+            r'SBN\.VideoLinkset\.entryGroup\(\s*(\[.+\])',
+        ], webpage, 'video data', default=None)
+        if entries_data:
+            entries_data = self._parse_json(entries_data, display_id)
+            if isinstance(entries_data, dict):
+                entries_data = [entries_data]
+            for video_data in entries_data:
+                provider_video_id = video_data.get('provider_video_id')
+                provider_video_type = video_data.get('provider_video_type')
+                if provider_video_id and provider_video_type:
+                    entries.append(create_entry(
+                        provider_video_id, provider_video_type,
+                        video_data.get('title'), video_data.get('description')))
+
+        provider_video_id = self._search_regex(
+            r'data-ooyala-id="([^"]+)"', webpage, 'ooyala id', default=None)
+        if provider_video_id:
+            entries.append(create_entry(provider_video_id, 'ooyala'))
+
+        volume_uuid = self._search_regex(
+            r'data-volume-uuid="([^"]+)"', webpage, 'volume uuid', default=None)
+        if volume_uuid:
+            volume_webpage = self._download_webpage(
+                'http://volume.vox-cdn.com/embed/%s' % volume_uuid, volume_uuid)
+            video_data = self._parse_json(self._search_regex(
+                r'Volume\.createVideo\(({.+})\s*,\s*{.*}\);', volume_webpage, 'video data'), volume_uuid)
+            for provider_video_type in ('ooyala', 'youtube'):
+                provider_video_id = video_data.get('%s_id' % provider_video_type)
+                if provider_video_id:
+                    description = video_data.get('description_long') or video_data.get('description_short')
+                    entries.append(create_entry(
+                        provider_video_id, provider_video_type, video_data.get('title_short'), description))
+                    break
+
+        if len(entries) == 1:
+            return entries[0]
+        else:
+            return self.playlist_result(entries, display_id, self._og_search_title(webpage), self._og_search_description(webpage))
--- a/youtube_dl/extractor/ynet.py
+++ b/youtube_dl/extractor/ynet.py
@@ -41,10 +41,12 @@ class YnetIE(InfoExtractor):
        m = re.search(r'ynet - HOT -- (["\']+)(?P<title>.+?)\1', title)
        if m:
            title = m.group('title')
+        formats = self._extract_f4m_formats(f4m_url, video_id)
+        self._sort_formats(formats)

        return {
            'id': video_id,
            'title': title,
-            'formats': self._extract_f4m_formats(f4m_url, video_id),
+            'formats': formats,
            'thumbnail': self._og_search_thumbnail(webpage),
        }
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -234,7 +234,9 @@ class YoutubePlaylistBaseInfoExtractor(YoutubeEntryListBaseInfoExtractor):

 class YoutubePlaylistsBaseInfoExtractor(YoutubeEntryListBaseInfoExtractor):
    def _process_page(self, content):
-        for playlist_id in orderedSet(re.findall(r'href="/?playlist\?list=([0-9A-Za-z-_]{10,})"', content)):
+        for playlist_id in orderedSet(re.findall(
+                r'<h3[^>]+class="[^"]*yt-lockup-title[^"]*"[^>]*><a[^>]+href="/?playlist\?list=([0-9A-Za-z-_]{10,})"',
+                content)):
            yield self.url_result(
                'https://www.youtube.com/playlist?list=%s' % playlist_id, 'YoutubePlaylist')

--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@@ -417,9 +417,12 @@ def sanitize_path(s):

 # Prepend protocol-less URLs with `http:` scheme in order to mitigate the number of
 # unwanted failures due to missing protocol
+def sanitize_url(url):
+    return 'http:%s' % url if url.startswith('//') else url
+
+
 def sanitized_Request(url, *args, **kwargs):
-    return compat_urllib_request.Request(
-        'http:%s' % url if url.startswith('//') else url, *args, **kwargs)
+    return compat_urllib_request.Request(sanitize_url(url), *args, **kwargs)


 def orderedSet(iterable):
@@ -775,12 +778,7 @@ class YoutubeDLHandler(compat_urllib_request.HTTPHandler):

        # Substitute URL if any change after escaping
        if url != url_escaped:
-            req_type = HEADRequest if req.get_method() == 'HEAD' else compat_urllib_request.Request
-            new_req = req_type(
-                url_escaped, data=req.data, headers=req.headers,
-                origin_req_host=req.origin_req_host, unverifiable=req.unverifiable)
-            new_req.timeout = req.timeout
-            req = new_req
+            req = update_Request(req, url=url_escaped)

        for h, v in std_headers.items():
            # Capitalize is needed because of Python bug 2275: http://bugs.python.org/issue2275
@@ -1801,6 +1799,20 @@ def update_url_query(url, query):
        query=compat_urllib_parse_urlencode(qs, True)))


+def update_Request(req, url=None, data=None, headers={}, query={}):
+    req_headers = req.headers.copy()
+    req_headers.update(headers)
+    req_data = data or req.data
+    req_url = update_url_query(url or req.get_full_url(), query)
+    req_type = HEADRequest if req.get_method() == 'HEAD' else compat_urllib_request.Request
+    new_req = req_type(
+        req_url, data=req_data, headers=req_headers,
+        origin_req_host=req.origin_req_host, unverifiable=req.unverifiable)
+    if hasattr(req, 'timeout'):
+        new_req.timeout = req.timeout
+    return new_req
+
+
 def dict_get(d, key_or_keys, default=None, skip_false_values=True):
    if isinstance(key_or_keys, (list, tuple)):
        for key in key_or_keys:
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2016.03.26'
+__version__ = '2016.04.01'
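The `utils.py` hunks above extract `sanitize_url` and add `update_Request`. The block below is a minimal standalone sketch of that idea, written against the Python 3 standard library only and not the project's actual code. It assumes `urllib.request.Request` in place of `compat_urllib_request.Request`, passes `method='HEAD'` instead of using the project's `HEADRequest` class, and avoids mutable default arguments. The name `update_request` and the body of `update_url_query` are illustrative assumptions, since the diff shows only the last line of the real `update_url_query`.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse
from urllib.request import Request


def sanitize_url(url):
    # Same rule as in the diff: protocol-less URLs get an `http:` scheme.
    return 'http:%s' % url if url.startswith('//') else url


def update_url_query(url, query):
    # Assumed implementation: merge `query` into the URL's existing query string.
    if not query:
        return url
    parsed = urlparse(url)
    qs = parse_qs(parsed.query)
    qs.update(query)
    return urlunparse(parsed._replace(query=urlencode(qs, True)))


def update_request(req, url=None, data=None, headers=None, query=None):
    # Build a new Request that copies `req` and overrides selected parts.
    new_headers = dict(req.headers)
    new_headers.update(headers or {})
    new_url = update_url_query(url or req.get_full_url(), query or {})
    # Preserve HEAD explicitly; otherwise let Request pick GET or POST from data.
    method = 'HEAD' if req.get_method() == 'HEAD' else None
    new_req = Request(
        new_url, data=data or req.data, headers=new_headers,
        origin_req_host=req.origin_req_host,
        unverifiable=req.unverifiable, method=method)
    # `timeout` is only set on requests that have already been opened.
    if hasattr(req, 'timeout'):
        new_req.timeout = req.timeout
    return new_req


if __name__ == '__main__':
    original = Request(sanitize_url('//example.com/v?x=1'))
    updated = update_request(original, query={'y': '2'})
    print(updated.get_full_url())
```

Returning a fresh request rather than mutating the original keeps the caller's object unchanged, which matches how the `http_request` hunk replaces `req` with the result.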
Author	SHA1	Message	Date
Philipp Hagemeister	1e02bc7ba2	release 2016.04.01	2016-04-01 09:07:40 +02:00
remitamine	63c55e9f22	[cbs] improve extraction(closes #6321 )	2016-04-01 07:33:37 +01:00
remitamine	f9b1529af8	[generic] remove sbnation test(handled by VoxMediaIE)	2016-03-31 23:50:45 +01:00
remitamine	961fc024d2	[voxmedia] improve sbnation support	2016-03-31 23:33:36 +01:00
Sergey M․	b53a06e3b9	[udemy:course] Use new URL format	2016-04-01 02:24:22 +06:00
remitamine	4ecc1fc638	[howstuffworks] improve extraction	2016-03-31 21:11:58 +01:00
Yen Chi Hsuan	5b012dfce8	[tudou] Improve error handling (closes #8988 )	2016-04-01 01:42:16 +08:00
remitamine	8369942773	[voxmedia] Add new extractor(closes #3182 )	2016-03-31 18:36:41 +01:00
Sergey M․	86f3b66cec	[udemy] Remove unused import	2016-03-31 23:00:11 +06:00
Sergey M․	6bb4600717	[udemy:course] Simplify course curriculum downloading	2016-03-31 22:59:19 +06:00
Sergey M․	41d06b0424	[extractor/common] Improve _request_webpage * Do not ignore data, headers and query for Requests * Default values for headers and query switched to dicts since these are used by urllib itself	2016-03-31 22:58:38 +06:00
Sergey M․	15d260ebaa	[utils] Use update_Request in http_request	2016-03-31 22:55:49 +06:00
Sergey M․	ed0291d153	[utils] Add update_Request	2016-03-31 22:55:01 +06:00
Sergey M․	81da8cbc45	[udemy] Switch to api 2.0 (Closes #9035 )	2016-03-31 22:05:25 +06:00
Sergey M․	5299bc3f91	[beeg] Switch to api v6 (Closes #9036 )	2016-03-31 20:42:41 +06:00
remitamine	c9c39c22c5	[nationalgeographic] add support for channel.nationalgeographic.com urls	2016-03-31 13:47:38 +01:00
remitamine	d84b48e3f1	[nationalgeographic] improve extraction	2016-03-31 13:44:55 +01:00
remitamine	dd17041c82	[tenplay] remove extractor(fixes #6927 )	2016-03-31 12:02:04 +01:00
remitamine	fea7295b14	[brightcove] relax embed_in_page regex	2016-03-31 10:48:22 +01:00
remitamine	9cf01f7f30	[nbc] add new extractor for csnne.com(#5432 )	2016-03-31 00:26:42 +01:00
remitamine	ce548296fe	[cnbc] fix test	2016-03-31 00:25:11 +01:00
remitamine	c02ec7d430	[cnbc] Add new extractor(closes #8012 )	2016-03-30 23:18:31 +01:00
remitamine	6b820a2376	[myspace] improve extraction	2016-03-30 21:18:07 +01:00
Yen Chi Hsuan	e621a344e6	[kwuo] Port to new API and enable --cn-verification-proxy	2016-03-31 02:27:52 +08:00
Yen Chi Hsuan	3ae6f8fec1	[kwuo] Remove _sort_formats() from KuwoBaseIE._get_formats() Following the idea proposed in `19dbaeece3`	2016-03-31 02:11:21 +08:00
Yen Chi Hsuan	597d52fadb	[kuwo:song] Correct song ID extraction (fixes #9033 ) Bug introduced in `daef04a4e7`.	2016-03-31 02:00:50 +08:00
Sergey M․	afca767d19	[tumblr] Improve _VALID_URL (Closes #9027 )	2016-03-30 22:26:43 +06:00
remitamine	6e359a1534	[comcarcoff] do not depend on crackle extractor(closes #8995 ) previously extraction has been delegated to crackle to extract more info and subtitles #6106 but some of the episodes can't be extracted using crackle #8995.	2016-03-30 12:27:00 +01:00
Sergey M․	607619bc90	Add manually generated ISSUE_TEMPLATE.md In order not to wait for the next release	2016-03-29 22:04:29 +06:00
Sergey M․	0b7bfc9422	Improve ISSUE_TEMPLATE_tmpl.md	2016-03-29 22:02:42 +06:00
Sergey M․	7168a6c874	[devscripts/make_issue_template] Fix __version__ again	2016-03-29 03:05:15 +06:00
Sergey M․	034947dd1e	Rename ISSUE_TEMPLATE.tmpl in order not to be picked up by github	2016-03-29 02:48:04 +06:00
Sergey M․	3c0de33ad7	Remove ISSUE_TEMPLATE.md	2016-03-29 02:43:48 +06:00
Sergey M․	89924f8230	[devscripts/make_issue_template] Fix NameError under python3	2016-03-29 02:41:27 +06:00
Sergey M․	a39c68f7e5	Exclude make_issue_template.py from flake8	2016-03-29 02:19:24 +06:00
Sergey M․	4a5a67ca25	[devscripts/release.sh] Make ISSUE_TEMPLATE.md and commit it	2016-03-29 02:18:52 +06:00
Sergey M․	8751da85a7	[Makefile] Fix ISSUE_TEMPLATE.md target	2016-03-29 02:17:57 +06:00
Sergey M․	3bf1df51fd	[devscripts/make_issue_template] Rework to use ISSUE_TEMPLATE.tmpl (Closes #8785 )	2016-03-29 02:16:38 +06:00
Sergey M․	3842a3e652	Add ISSUE_TEMPLATE.tmpl as template for ISSUE_TEMPLATE.md	2016-03-29 02:15:26 +06:00
Sander van den Oever	7710bdf4e8	Add initial ISSUE_TEMPLATE Add auto-updating of youtube-dl version in ISSUE_TEMPLATE Move parts of template text and adopt makefile to new format Moved the 'kind-of-issue' section and rephrased a bit Rephrased and moved Example URL section upwards Moved ISSUE_TEMPLATE inside .github folder. Update makefile to match new folderstructure	2016-03-28 22:43:13 +06:00
Sergey M	8d9dd3c34b	[README.md] Add format_id to the list of string meta fields available for use in format selection	2016-03-28 03:08:34 +05:00
Sergey M․	33f3040a3e	[YoutubeDL] Fix sanitizing subtitles' url	2016-03-28 03:13:39 +06:00
Sergey M․	03442072c0	[pornhub] Fix typo (Closes #9008 )	2016-03-28 01:21:44 +06:00
Sergey M․	c8b13fec02	[foxnews] Restore upload time fields in test	2016-03-28 01:14:12 +06:00
Sergey M․	87d105ac6c	[amp] Fix upload timestamp extraction (Closes #9007 )	2016-03-28 01:13:47 +06:00
Sergey M․	3454139576	[pornhub:uservideos] Add support for multipage videos (Closes #9006 )	2016-03-28 00:50:46 +06:00
Sergey M․	3a23bae9cc	[pornhub:playlistbase] Do not include videos not from playlist	2016-03-28 00:32:57 +06:00
Sergey M․	8f9a477e7f	[pornhub:playlistbase] Use orderedSet	2016-03-28 00:21:08 +06:00
Sergey M․	a1cf3e38a3	[bbc] Extend vpid regex (Closes #9003 )	2016-03-27 23:22:51 +06:00
Philipp Hagemeister	a122e7080b	release 2016.03.27	2016-03-27 16:56:33 +02:00
Sergey M․	b22ca76204	[extractor/common] Filter out unsupported encrypted media for f4m formats (Closes #8573 )	2016-03-27 07:42:38 +06:00
Sergey M․	f7df343b4a	[downloader/f4m] Extract routine for removing unsupported encrypted media	2016-03-27 07:41:19 +06:00
Sergey M․	19dbaeece3	Remove _sort_formats from _extract_*_formats methods Now _sort_formats should be called explicitly. _sort_formats has been added to all the necessary places in code. Closes #8051	2016-03-27 07:03:08 +06:00
Yen Chi Hsuan	395fd4b08a	[twitter] Handle another form of embedded Vine Fixes #8996	2016-03-27 04:36:02 +08:00
Sergey M․	8018028d0f	[pluralsight] Extract chapter metadata (Closes #8993 )	2016-03-27 02:10:52 +06:00
Sergey M․	00322ad4fd	[lynda] Extract chapter metadata (#8993 )	2016-03-27 02:00:36 +06:00
Sergey M․	4cf3489c6e	[vevo] Update videoservice API URL (Closes #8900 )	2016-03-27 01:11:11 +06:00
Sergey M․	b24ab3e341	[udemy] Improve paid course detection	2016-03-27 00:09:12 +06:00
Sergey M․	af4116f4f0	[udemy] Improve format_id	2016-03-27 00:02:52 +06:00
Sergey M․	f973e5d54e	[udemy] Drop outputs' formats Always results in 403	2016-03-26 23:55:07 +06:00
Sergey M․	62f55aa68a	[udemy] Add outputs metadata to view_html formats	2016-03-26 23:54:12 +06:00
Sergey M․	02d7634d24	[udemy] Fix outputs' formats format_id	2016-03-26 23:43:25 +06:00
Sergey M․	48dce58ca9	[udemy] Use custom sorting	2016-03-26 23:42:46 +06:00
Sergey M․	efcba804f6	[udemy] Extract formats from view_html (Closes #8979 )	2016-03-26 23:42:34 +06:00
Sergey M․	6dee688e6d	[youtube:playlistsbase] Restrict playlist regex (Closes #8986 )	2016-03-26 20:42:18 +06:00
Sergey M․	eedb7ba536	[YoutubeDL] Sort imports	2016-03-26 19:40:33 +06:00
Sergey M․	dcf77cf1a7	[YoutubeDL] Sanitize final URLs (Closes #8991 )	2016-03-26 19:37:41 +06:00
Sergey M․	17bcc626bf	[utils] Extract sanitize_url routine	2016-03-26 19:33:57 +06:00
Sergey M․	b5a5bbf376	[mailru] Extend _VALID_URL (Closes #8990 )	2016-03-26 19:15:32 +06:00
Yen Chi Hsuan	e68d3a010f	[twitter] Fix extraction (closes #8966 ) HLS and DASH formats no longer appear in test cases. I keep them for fear of triggering new errors.	2016-03-26 18:34:51 +08:00
Yen Chi Hsuan	d10fe8358c	[generic] Add a test case for brightcove embed Closes #8862	2016-03-26 18:30:43 +08:00
Yen Chi Hsuan	d6c340cae5	[brightcove] Extract more formats (#8862 )	2016-03-26 18:21:07 +08:00
Yen Chi Hsuan	5964b598ff	[brightcove] Support alternative BrightcoveExperience layout The full URL lays in the `data` attribute of <object> (#8862)	2016-03-26 17:47:32 +08:00