mirror of https://github.com/ytdl-org/youtube-dl.git synced 2024-11-30 18:34:36 +01:00

Merge branch 'master' into master

LangerJan 2020-01-13 12:18:28 +01:00 committed by GitHub
commit cd7602aabc
GPG Key ID: 4AEE18F83AFDEB23
376 changed files with 18183 additions and 12745 deletions


@ -1,61 +0,0 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`)
- Use the *Preview* tab to see what your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2019.01.16*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2019.01.16**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v <your command line>`), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2019.01.16
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/rg3/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.

.github/ISSUE_TEMPLATE/1_broken_site.md vendored Normal file

@ -0,0 +1,63 @@
---
name: Broken site support
about: Report broken or misfunctioning site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.01.01. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **2020.01.01**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2020.01.01
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE
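
The checklist above asks reporters to confirm that the version in their verbose log matches the latest release. A minimal sketch of that check, assuming the pasted log keeps the `[debug] youtube-dl version ...` line shown in the template; the helper name `reported_version` is hypothetical and not part of youtube-dl:

```python
import re

# Hypothetical helper, not part of youtube-dl: find the version line in a pasted verbose log.
VERSION_LINE = re.compile(r"^\[debug\] youtube-dl version (\S+)", re.MULTILINE)


def reported_version(log_text):
    """Return the version string from a verbose log, or None if the line is missing."""
    match = VERSION_LINE.search(log_text)
    return match.group(1) if match else None


# Sample log trimmed from the template's own example output.
sample_log = """[debug] System config: []
[debug] User config: []
[debug] youtube-dl version 2020.01.01
[debug] Proxy map: {}
"""

print(reported_version(sample_log))
```

A triager could compare the returned string with the version named in the checklist before reading the rest of the report.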


@ -0,0 +1,54 @@
---
name: Site support request
about: Request support for a new site
title: ''
labels: 'site-support-request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.01.01. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **2020.01.01**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones
## Example URLs
<!--
Provide all kinds of example URLs support for which should be included. Replace following example URLs by yours.
-->
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
## Description
<!--
Provide any additional information.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE


@ -0,0 +1,37 @@
---
name: Site feature request
about: Request a new functionality for a site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.01.01. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **2020.01.01**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones
## Description
<!--
Provide an explanation of your site feature request in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

.github/ISSUE_TEMPLATE/4_bug_report.md vendored Normal file

@ -0,0 +1,65 @@
---
name: Bug report
about: Report a bug unrelated to any particular site or extractor
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.01.01. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Read bugs section in FAQ: http://yt-dl.org/reporting
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **2020.01.01**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
- [ ] I've read bugs section in FAQ
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2020.01.01
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE


@ -0,0 +1,38 @@
---
name: Feature request
about: Request a new functionality unrelated to any particular site or extractor
title: ''
labels: 'request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.01.01. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **2020.01.01**
- [ ] I've searched the bugtracker for similar feature requests including closed ones
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

.github/ISSUE_TEMPLATE/6_question.md vendored Normal file

@ -0,0 +1,38 @@
---
name: Ask question
about: Ask youtube-dl related question
title: ''
labels: 'question'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- Look through the README (http://yt-dl.org/readme) and FAQ (http://yt-dl.org/faq) for similar questions
- Search the bugtracker for similar questions: http://yt-dl.org/search-issues
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm asking a question
- [ ] I've looked through the README and FAQ for similar questions
- [ ] I've searched the bugtracker for similar questions including closed ones
## Question
<!--
Ask your question in an arbitrary form. Please make sure it's worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient.
-->
WRITE QUESTION HERE


@ -1,61 +0,0 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`)
- Use the *Preview* tab to see what your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *%(version)s*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v <your command line>`), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/rg3/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.


@ -0,0 +1,63 @@
---
name: Broken site support
about: Report broken or misfunctioning site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE


@ -0,0 +1,54 @@
---
name: Site support request
about: Request support for a new site
title: ''
labels: 'site-support-request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones
## Example URLs
<!--
Provide all kinds of example URLs support for which should be included. Replace following example URLs by yours.
-->
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
## Description
<!--
Provide any additional information.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE


@ -0,0 +1,37 @@
---
name: Site feature request
about: Request a new functionality for a site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones
## Description
<!--
Provide an explanation of your site feature request in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE


@ -0,0 +1,65 @@
---
name: Bug report
about: Report a bug unrelated to any particular site or extractor
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Read bugs section in FAQ: http://yt-dl.org/reporting
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
- [ ] I've read bugs section in FAQ
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE


@ -0,0 +1,38 @@
---
name: Feature request
about: Request a new functionality unrelated to any particular site or extractor
title: ''
labels: 'request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've searched the bugtracker for similar feature requests including closed ones
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE
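
The template files above carry a `%(version)s` placeholder where the rendered templates earlier in this commit show `2020.01.01`. A minimal sketch of how such a placeholder can be filled with Python's `%`-style mapping formatting; the function name `render` is hypothetical, and the script that actually generates the templates is not shown in this diff:

```python
# Hypothetical illustration; the real generation script is not part of this diff.
template_line = "- [ ] I've verified that I'm running youtube-dl version **%(version)s**"


def render(template_text, version):
    """Fill the %(version)s placeholder using %-style mapping formatting."""
    return template_text % {"version": version}


print(render(template_line, "2020.01.01"))
```

With this style of formatting, any literal `%` in a template has to be written as `%%`, otherwise the substitution raises an error.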


@ -7,8 +7,8 @@
 ---
 ### Before submitting a *pull request* make sure you have:
-- [ ] At least skimmed through [adding new extractor tutorial](https://github.com/rg3/youtube-dl#adding-support-for-a-new-site) and [youtube-dl coding conventions](https://github.com/rg3/youtube-dl#youtube-dl-coding-conventions) sections
+- [ ] At least skimmed through [adding new extractor tutorial](https://github.com/ytdl-org/youtube-dl#adding-support-for-a-new-site) and [youtube-dl coding conventions](https://github.com/ytdl-org/youtube-dl#youtube-dl-coding-conventions) sections
-- [ ] [Searched](https://github.com/rg3/youtube-dl/search?q=is%3Apr&type=Issues) the bugtracker for similar pull requests
+- [ ] [Searched](https://github.com/ytdl-org/youtube-dl/search?q=is%3Apr&type=Issues) the bugtracker for similar pull requests
 - [ ] Checked the code with [flake8](https://pypi.python.org/pypi/flake8)
 ### In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under [Unlicense](http://unlicense.org/). Check one of the following options:


@ -9,7 +9,7 @@ python:
 - "3.6"
 - "pypy"
 - "pypy3"
-sudo: false
+dist: trusty
 env:
 - YTDL_TEST_SET=core
 - YTDL_TEST_SET=download
@ -21,6 +21,12 @@ matrix:
 - python: 3.7
 dist: xenial
 env: YTDL_TEST_SET=download
+- python: 3.8
+dist: xenial
+env: YTDL_TEST_SET=core
+- python: 3.8
+dist: xenial
+env: YTDL_TEST_SET=download
 - python: 3.8-dev
 dist: xenial
 env: YTDL_TEST_SET=core


@ -42,11 +42,11 @@ Before reporting any issue, type `youtube-dl -U`. This should report that you're
### Is the issue already documented? ### Is the issue already documented?
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/rg3/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity. Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/ytdl-org/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
### Why are existing options not enough? ### Why are existing options not enough?
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem. Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
### Is there enough context in your bug report? ### Is there enough context in your bug report?
@ -70,7 +70,7 @@ It may sound strange, but some bug reports we receive are completely unrelated t
# DEVELOPER INSTRUCTIONS # DEVELOPER INSTRUCTIONS
Most users do not need to build youtube-dl and can [download the builds](https://rg3.github.io/youtube-dl/download.html) or get them from their distribution. Most users do not need to build youtube-dl and can [download the builds](https://ytdl-org.github.io/youtube-dl/download.html) or get them from their distribution.
To run youtube-dl as a developer, you don't need to build anything either. Simply execute To run youtube-dl as a developer, you don't need to build anything either. Simply execute
@ -98,7 +98,7 @@ If you want to add support for a new site, first of all **make sure** this site
After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`): After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork) 1. [Fork this repository](https://github.com/ytdl-org/youtube-dl/fork)
2. Check out the source code with: 2. Check out the source code with:
git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git
@ -150,9 +150,9 @@ After you have ensured this site is distributing its content legally, you can fo
# TODO more properties (see youtube_dl/extractor/common.py) # TODO more properties (see youtube_dl/extractor/common.py)
} }
``` ```
5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py). 5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want. 7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want.
8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](http://flake8.pycqa.org/en/latest/index.html#quickstart): 8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](http://flake8.pycqa.org/en/latest/index.html#quickstart):
$ flake8 youtube_dl/extractor/yourextractor.py $ flake8 youtube_dl/extractor/yourextractor.py
@ -177,7 +177,7 @@ Extractors are very fragile by nature since they depend on the layout of the sou
### Mandatory and optional metafields ### Mandatory and optional metafields
For extraction to work youtube-dl relies on metadata your extractor extracts and provides to youtube-dl expressed by an [information dictionary](https://github.com/rg3/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303) or simply *info dict*. Only the following meta fields in the *info dict* are considered mandatory for a successful extraction process by youtube-dl: For extraction to work youtube-dl relies on metadata your extractor extracts and provides to youtube-dl expressed by an [information dictionary](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303) or simply *info dict*. Only the following meta fields in the *info dict* are considered mandatory for a successful extraction process by youtube-dl:
- `id` (media identifier) - `id` (media identifier)
- `title` (media title) - `title` (media title)
@ -185,7 +185,7 @@ For extraction to work youtube-dl relies on metadata your extractor extracts and
In fact only the last option is technically mandatory (i.e. if you can't figure out the download location of the media the extraction does not make any sense). But by convention youtube-dl also treats `id` and `title` as mandatory. Thus the aforementioned metafields are the critical data that the extraction does not make any sense without and if any of them fail to be extracted then the extractor is considered completely broken. In fact only the last option is technically mandatory (i.e. if you can't figure out the download location of the media the extraction does not make any sense). But by convention youtube-dl also treats `id` and `title` as mandatory. Thus the aforementioned metafields are the critical data that the extraction does not make any sense without and if any of them fail to be extracted then the extractor is considered completely broken.
[Any field](https://github.com/rg3/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L188-L303) apart from the aforementioned ones are considered **optional**. That means that extraction should be **tolerant** to situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. [Any field](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L188-L303) apart from the aforementioned ones are considered **optional**. That means that extraction should be **tolerant** to situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields.
#### Example #### Example
@ -339,15 +339,83 @@ Incorrect:
'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4' 'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
``` ```
### Use safe conversion functions ### Inline values
Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well. Extracting variables is acceptable for reducing code duplication and improving readability of complex expressions. However, you should avoid extracting variables used only once and moving them to opposite parts of the extractor file, which makes reading the linear flow difficult.
#### Example
Correct:
```python
title = self._html_search_regex(r'<title>([^<]+)</title>', webpage, 'title')
```
Incorrect:
```python
TITLE_RE = r'<title>([^<]+)</title>'
# ...some lines of code...
title = self._html_search_regex(TITLE_RE, webpage, 'title')
```
### Collapse fallbacks
Multiple fallback values can quickly become unwieldy. Collapse multiple fallback values into a single expression via a list of patterns.
#### Example
Good:
```python
description = self._html_search_meta(
['og:description', 'description', 'twitter:description'],
webpage, 'description', default=None)
```
Unwieldy:
```python
description = (
self._og_search_description(webpage, default=None)
or self._html_search_meta('description', webpage, default=None)
or self._html_search_meta('twitter:description', webpage, default=None))
```
Methods supporting list of patterns are: `_search_regex`, `_html_search_regex`, `_og_search_property`, `_html_search_meta`.
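As a rough illustration of why a list of patterns keeps fallbacks compact, the sketch below uses a hypothetical `search_first` helper built on `re.search`. It is not youtube-dl's implementation of `_search_regex`; the helper name, the sample `webpage` string and the patterns are assumptions made for this example.

```python
import re


def search_first(patterns, text, default=None):
    """Return group 1 of the first pattern that matches, else default.

    Hypothetical helper: illustrates the idea only, not youtube-dl's API.
    """
    # Accept a single pattern as well as a list of patterns.
    if isinstance(patterns, str):
        patterns = [patterns]
    for pattern in patterns:
        match = re.search(pattern, text)
        if match:
            return match.group(1)
    return default


# Assumed sample page that only carries the plain description meta tag.
webpage = '<meta name="description" content="A short clip">'

# The patterns are tried in order, so the fallback chain is a single call.
description = search_first(
    [r'property="og:description" content="([^"]+)"',
     r'name="description" content="([^"]+)"'],
    webpage, default=None)
```

Because the patterns are tried in order, adding another fallback only means appending one more entry to the list.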
### Trailing parentheses
Always move trailing parentheses after the last argument.
#### Example
Correct:
```python
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list)
```
Incorrect:
```python
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list,
)
```
### Use convenience conversion and parsing functions
Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
Use `url_or_none` for safe URL processing. Use `url_or_none` for safe URL processing.
Use `try_get` for safe metadata extraction from parsed JSON. Use `try_get` for safe metadata extraction from parsed JSON.
Explore [`youtube_dl/utils.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/utils.py) for more useful convenience functions. Use `unified_strdate` for uniform `upload_date` or any `YYYYMMDD` meta field extraction, `unified_timestamp` for uniform `timestamp` extraction, `parse_filesize` for `filesize` extraction, `parse_count` for count meta fields extraction, `parse_resolution`, `parse_duration` for `duration` extraction, `parse_age_limit` for `age_limit` extraction.
Explore [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py) for more useful convenience functions.
#### More examples #### More examples
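The behaviour of the safe conversion helpers can be approximated with the minimal sketch below. The two functions are simplified stand-ins written for this example: the real `int_or_none` and `try_get` live in `youtube_dl/utils.py` and accept more parameters. The `meta` dictionary and its keys are hypothetical.

```python
def int_or_none(value, default=None):
    """Return int(value), or default when the value is missing or malformed."""
    if value is None:
        return default
    try:
        return int(value)
    except (ValueError, TypeError):
        return default


def try_get(src, getter, expected_type=None):
    """Apply getter to src, returning None when a key or index is missing."""
    try:
        result = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(result, expected_type):
        return result
    return None


# Hypothetical parsed JSON with one valid, one malformed and one absent field.
meta = {'video': {'views': '1024', 'duration': 'n/a'}}

view_count = int_or_none(try_get(meta, lambda x: x['video']['views']))
duration = int_or_none(try_get(meta, lambda x: x['video']['duration']))
like_count = int_or_none(try_get(meta, lambda x: x['stats']['likes']))
```

With helpers of this shape, a missing or malformed optional field yields `None` rather than an exception, so extraction of the mandatory fields is not interrupted.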

ChangeLog

@ -1,3 +1,949 @@
version 2020.01.01
Extractors
* [brightcove] Invalidate policy key cache on failing requests
* [pornhub] Improve locked videos detection (#22449, #22780)
+ [pornhub] Add support for m3u8 formats
* [pornhub] Fix extraction (#22749, #23082)
* [brightcove] Update policy key on failing requests
* [spankbang] Improve removed video detection (#23423)
* [spankbang] Fix extraction (#23307, #23423, #23444)
* [soundcloud] Automatically update client id on failing requests
* [prosiebensat1] Improve geo restriction handling (#23571)
* [brightcove] Cache brightcove player policy keys
* [teachable] Fail with error message if no video URL found
* [teachable] Improve locked lessons detection (#23528)
+ [scrippsnetworks] Add support for Scripps Networks sites (#19857, #22981)
* [mitele] Fix extraction (#21354, #23456)
* [soundcloud] Update client id (#23516)
* [mailru] Relax URL regular expressions (#23509)
version 2019.12.25
Core
* [utils] Improve str_to_int
+ [downloader/hls] Add ability to override AES decryption key URL (#17521)
Extractors
* [mediaset] Fix parse formats (#23508)
+ [tv2dk:bornholm:play] Add support for play.tv2bornholm.dk (#23291)
+ [slideslive] Add support for url and vimeo service names (#23414)
* [slideslive] Fix extraction (#23413)
* [twitch:clips] Fix extraction (#23375)
+ [soundcloud] Add support for token protected embeds (#18954)
* [vk] Improve extraction
* Fix User Videos extraction (#23356)
* Extract all videos for lists with more than 1000 videos (#23356)
+ Add support for video albums (#14327, #14492)
- [kontrtube] Remove extractor
- [videopremium] Remove extractor
- [musicplayon] Remove extractor (#9225)
+ [ufctv] Add support for ufcfightpass.imgdge.com and
ufcfightpass.imggaming.com (#23343)
+ [twitch] Extract m3u8 formats frame rate (#23333)
+ [imggaming] Add support for playlists and extract subtitles
+ [ufcarabia] Add support for UFC Arabia (#23312)
* [ufctv] Fix extraction
* [yahoo] Fix gyao brightcove player id (#23303)
* [vzaar] Override AES decryption key URL (#17521)
+ [vzaar] Add support for AES HLS manifests (#17521, #23299)
* [nrl] Fix extraction
* [teachingchannel] Fix extraction
* [nintendo] Fix extraction and partially add support for Nintendo Direct
videos (#4592)
+ [ooyala] Add better fallback values for domain and streams variables
+ [youtube] Add support for youtubekids.com (#23272)
* [tv2] Detect DRM protection
+ [tv2] Add support for katsomo.fi and mtv.fi (#10543)
* [tv2] Fix tv2.no article extraction
* [msn] Improve extraction
+ Add support for YouTube and NBCSports embeds
+ Add support for articles with multiple videos
* Improve AOL embed support
* Improve format extraction
* [abcotvs] Relax URL regular expression and improve metadata extraction
(#18014)
* [channel9] Reduce response size
* [adobetv] Improve extraction
* Use OnDemandPagedList for list extractors
* Reduce show extraction requests
* Extract original video format and subtitles
+ Add support for adobe tv embeds
version 2019.11.28
Core
+ [utils] Add generic caesar cipher and rot47
* [utils] Handle rd-suffixed day parts in unified_strdate (#23199)
Extractors
* [vimeo] Improve extraction
* Fix review extraction
* Fix ondemand extraction
* Make password protected player case as an expected error (#22896)
* Simplify channel based extractors code
- [openload] Remove extractor (#11999)
- [verystream] Remove extractor
- [streamango] Remove extractor (#15406)
* [dailymotion] Improve extraction
* Extract http formats included in m3u8 manifest
* Fix user extraction (#3553, #21415)
+ Add support for User Authentication (#11491)
* Fix password protected videos extraction (#23176)
* Respect age limit option and family filter cookie value (#18437)
* Handle video url playlist query param
* Report allowed countries for geo-restricted videos
* [corus] Improve extraction
+ Add support for Series Plus, W Network, YTV, ABC Spark, disneychannel.com
and disneylachaine.ca (#20861)
+ Add support for self hosted videos (#22075)
* Detect DRM protection (#14910, #9164)
* [vivo] Fix extraction (#22328, #22279)
+ [bitchute] Extract upload date (#22990, #23193)
* [soundcloud] Update client id (#23214)
version 2019.11.22
Core
+ [extractor/common] Clean jwplayer description HTML tags
+ [extractor/common] Add data, headers and query to all major extract formats
methods
Extractors
* [chaturbate] Fix extraction (#23010, #23012)
+ [ntvru] Add support for non relative file URLs (#23140)
* [vk] Fix wall audio thumbnails extraction (#23135)
* [ivi] Fix format extraction (#21991)
- [comcarcoff] Remove extractor
+ [drtv] Add support for new URL schema (#23059)
+ [nexx] Add support for Multi Player JS Setup (#23052)
+ [teamcoco] Add support for new videos (#23054)
* [soundcloud] Check if the soundtrack has downloads left (#23045)
* [facebook] Fix posts video data extraction (#22473)
- [addanime] Remove extractor
- [minhateca] Remove extractor
- [daisuki] Remove extractor
* [seeker] Fix extraction
- [revision3] Remove extractors
* [twitch] Fix video comments URL (#18593, #15828)
* [twitter] Improve extraction
+ Add support for generic embeds (#22168)
* Always extract http formats for native videos (#14934)
+ Add support for Twitter Broadcasts (#21369)
+ Extract more metadata
* Improve VMap format extraction
* Unify extraction code for both twitter statuses and cards
+ [twitch] Add support for Clip embed URLs
* [lnkgo] Fix extraction (#16834)
* [mixcloud] Improve extraction
* Improve metadata extraction (#11721)
* Fix playlist extraction (#22378)
* Fix user mixes extraction (#15197, #17865)
+ [kinja] Add support for Kinja embeds (#5756, #11282, #22237, #22384)
* [onionstudios] Fix extraction
+ [hotstar] Pass Referer header to format requests (#22836)
* [dplay] Minimize response size
+ [patreon] Extract uploader_id and filesize
* [patreon] Minimize response size
* [roosterteeth] Fix login request (#16094, #22689)
version 2019.11.05
Extractors
+ [scte] Add support for learning.scte.org (#22975)
+ [msn] Add support for Vidible and AOL embeds (#22195, #22227)
* [myspass] Fix video URL extraction and improve metadata extraction (#22448)
* [jamendo] Improve extraction
* Fix album extraction (#18564)
* Improve metadata extraction (#18565, #21379)
* [mediaset] Relax URL guid matching (#18352)
+ [mediaset] Extract unprotected M3U and MPD manifests (#17204)
* [telegraaf] Fix extraction
+ [bellmedia] Add support for marilyn.ca videos (#22193)
* [stv] Fix extraction (#22928)
- [iconosquare] Remove extractor
- [keek] Remove extractor
- [gameone] Remove extractor (#21778)
- [flipagram] Remove extractor
- [bambuser] Remove extractor
* [wistia] Reduce embed extraction false positives
+ [wistia] Add support for inline embeds (#22931)
- [go90] Remove extractor
* [kakao] Remove raw request
+ [kakao] Extract format total bitrate
* [daum] Fix VOD and Clip extraction (#15015)
* [kakao] Improve extraction
+ Add support for embed URLs
+ Add support for Kakao Legacy vid based embed URLs
* Only extract fields used for extraction
* Strip description and extract tags
* [mixcloud] Fix cloudcast data extraction (#22821)
* [yahoo] Improve extraction
+ Add support for live streams (#3597, #3779, #22178)
* Bypass cookie consent page for european domains (#16948, #22576)
+ Add generic support for embeds (#20332)
* [tv2] Fix and improve extraction (#22787)
+ [tv2dk] Add support for TV2 DK sites
* [onet] Improve extraction …
+ Add support for onet100.vod.pl
+ Extract m3u8 formats
* Correct audio only format info
* [fox9] Fix extraction
version 2019.10.29
Core
* [utils] Actualize major IPv4 address blocks per country
Extractors
+ [go] Add support for abc.com and freeform.com (#22823, #22864)
+ [mtv] Add support for mtvjapan.com
* [mtv] Fix extraction for mtv.de (#22113)
* [videodetective] Fix extraction
* [internetvideoarchive] Fix extraction
* [nbcnews] Fix extraction (#12569, #12576, #21703, #21923)
- [hark] Remove extractor
- [tutv] Remove extractor
- [learnr] Remove extractor
- [macgamestore] Remove extractor
* [la7] Update Kaltura service URL (#22358)
* [thesun] Fix extraction (#16966)
- [makertv] Remove extractor
+ [tenplay] Add support for 10play.com.au (#21446)
* [soundcloud] Improve extraction
* Improve format extraction (#22123)
+ Extract uploader_id and uploader_url (#21916)
+ Extract all known thumbnails (#19071, #20659)
* Fix extraction for private playlists (#20976)
+ Add support for playlist embeds (#20976)
* Skip preview formats (#22806)
* [dplay] Improve extraction
+ Add support for dplay.fi, dplay.jp and es.dplay.com (#16969)
* Fix it.dplay.com extraction (#22826)
+ Extract creator, tags and thumbnails
* Handle playback API call errors
+ [discoverynetworks] Add support for dplay.co.uk
* [vk] Improve extraction
+ Add support for Odnoklassniki embeds
+ Extract more videos from user lists (#4470)
+ Fix wall post audio extraction (#18332)
* Improve error detection (#22568)
+ [odnoklassniki] Add support for embeds
* [puhutv] Improve extraction
* Fix subtitles extraction
* Transform HLS URLs to HTTP URLs
* Improve metadata extraction
* [ceskatelevize] Skip DRM media
+ [facebook] Extract subtitles (#22777)
* [globo] Handle alternative hash signing method
version 2019.10.22
Core
* [utils] Improve subtitles_filename (#22753)
Extractors
* [facebook] Bypass download rate limits (#21018)
+ [contv] Add support for contv.com
- [viewster] Remove extractor
* [xfileshare] Improve extractor (#17032, #17906, #18237, #18239)
* Update the list of domains
+ Add support for aa-encoded video data
* Improve jwplayer format extraction
+ Add support for Clappr sources
* [mangomolo] Fix video format extraction and add support for player URLs
* [audioboom] Improve metadata extraction
* [twitch] Update VOD URL matching (#22395, #22727)
- [mit] Remove support for video.mit.edu (#22403)
- [servingsys] Remove extractor (#22639)
* [dumpert] Fix extraction (#22428, #22564)
* [atresplayer] Fix extraction (#16277, #16716)
version 2019.10.16
Core
* [extractor/common] Make _is_valid_url more relaxed
Extractors
* [vimeo] Improve album videos id extraction (#22599)
+ [globo] Extract subtitles (#22713)
* [bokecc] Improve player params extraction (#22638)
* [nexx] Handle result list (#22666)
* [vimeo] Fix VHX embed extraction
* [nbc] Switch to graphql API (#18581, #22693, #22701)
- [vessel] Remove extractor
- [promptfile] Remove extractor (#6239)
* [kaltura] Fix service URL extraction (#22658)
* [kaltura] Fix embed info strip (#22658)
* [globo] Fix format extraction (#20319)
* [redtube] Improve metadata extraction (#22492, #22615)
* [pornhub:uservideos:upload] Fix extraction (#22619)
+ [telequebec:squat] Add support for squat.telequebec.tv (#18503)
- [wimp] Remove extractor (#22088, #22091)
+ [gfycat] Extend URL regular expression (#22225)
+ [chaturbate] Extend URL regular expression (#22309)
* [peertube] Update instances (#22414)
+ [telequebec] Add support for coucou.telequebec.tv (#22482)
+ [xvideos] Extend URL regular expression (#22471)
- [youtube] Remove support for invidious.enkirton.net (#22543)
+ [openload] Add support for oload.monster (#22592)
* [nrktv:seriebase] Fix extraction (#22596)
+ [youtube] Add support for yt.lelux.fi (#22597)
* [orf:tvthek] Make manifest requests non fatal (#22578)
* [teachable] Skip login when already logged in (#22572)
* [viewlift] Improve extraction (#22545)
* [nonktube] Fix extraction (#22544)
version 2019.09.28
Core
* [YoutubeDL] Honour all --get-* options with --flat-playlist (#22493)
Extractors
* [vk] Fix extraction (#22522)
* [heise] Fix kaltura embeds extraction (#22514)
* [ted] Check for resources validity and extract subtitled downloads (#22513)
+ [youtube] Add support for
owxfohz4kjyv25fvlqilyxast7inivgiktls3th44jhk3ej3i7ya.b32.i2p (#22292)
+ [nhk] Add support for clips
* [nhk] Fix video extraction (#22249, #22353)
* [byutv] Fix extraction (#22070)
+ [openload] Add support for oload.online (#22304)
+ [youtube] Add support for invidious.drycat.fr (#22451)
* [jwplatform] Do not match video URLs (#20596, #22148)
* [youtube:playlist] Unescape playlist uploader (#22483)
+ [bilibili] Add support for audio albums and songs (#21094)
+ [instagram] Add support for tv URLs
+ [mixcloud] Allow uppercase letters in format URLs (#19280)
* [brightcove] Delegate all supported legacy URLs to new extractor (#11523,
#12842, #13912, #15669, #16303)
* [hotstar] Use native HLS downloader by default
+ [hotstar] Extract more formats (#22323)
* [9now] Fix extraction (#22361)
* [zdf] Bypass geo restriction
+ [tv4] Extract series metadata
* [tv4] Fix extraction (#22443)
version 2019.09.12.1
Extractors
* [youtube] Remove quality and tbr for itag 43 (#22372)
version 2019.09.12
Extractors
* [youtube] Quick extraction tempfix (#22367, #22163)
version 2019.09.01
Core
+ [extractor/generic] Add support for squarespace embeds (#21294, #21802,
#21859)
+ [downloader/external] Respect mtime option for aria2c (#22242)
Extractors
+ [xhamster:user] Add support for user pages (#16330, #18454)
+ [xhamster] Add support for more domains
+ [verystream] Add support for woof.tube (#22217)
+ [dailymotion] Add support for lequipe.fr (#21328, #22152)
+ [openload] Add support for oload.vip (#22205)
+ [bbccouk] Extend URL regular expression (#19200)
+ [youtube] Add support for invidious.nixnet.xyz and yt.elukerio.org (#22223)
* [safari] Fix authentication (#22161, #22184)
* [usanetwork] Fix extraction (#22105)
+ [einthusan] Add support for einthusan.ca (#22171)
* [youtube] Improve unavailable message extraction (#22117)
+ [piksel] Extract subtitles (#20506)
version 2019.08.13
Core
* [downloader/fragment] Fix ETA calculation of resumed download (#21992)
* [YoutubeDL] Check annotations availability (#18582)
Extractors
* [youtube:playlist] Improve flat extraction (#21927)
* [youtube] Fix annotations extraction (#22045)
+ [discovery] Extract series meta field (#21808)
* [youtube] Improve error detection (#16445)
* [vimeo] Fix album extraction (#1933, #15704, #15855, #18967, #21986)
+ [roosterteeth] Add support for watch URLs
* [discovery] Limit video data by show slug (#21980)
version 2019.08.02
Extractors
+ [tvigle] Add support for HLS and DASH formats (#21967)
* [tvigle] Fix extraction (#21967)
+ [yandexvideo] Add support for DASH formats (#21971)
* [discovery] Use API call for video data extraction (#21808)
+ [mgtv] Extract format_note (#21881)
* [tvn24] Fix metadata extraction (#21833, #21834)
* [dlive] Relax URL regular expression (#21909)
+ [openload] Add support for oload.best (#21913)
* [youtube] Improve metadata extraction for age gate content (#21943)
version 2019.07.30
Extractors
* [youtube] Fix and improve title and description extraction (#21934)
version 2019.07.27
Extractors
+ [yahoo:japannews] Add support for yahoo.co.jp (#21698, #21265)
+ [discovery] Add support for go.discovery.com URLs
* [youtube:playlist] Relax video regular expression (#21844)
* [generic] Restrict --default-search schemeless URLs detection pattern
(#21842)
* [vrv] Fix CMS signing query extraction (#21809)
version 2019.07.16
Extractors
+ [asiancrush] Add support for yuyutv.com, midnightpulp.com and cocoro.tv
(#21281, #21290)
* [kaltura] Check source format URL (#21290)
* [ctsnews] Fix YouTube embeds extraction (#21678)
+ [einthusan] Add support for einthusan.com (#21748, #21775)
+ [youtube] Add support for invidious.mastodon.host (#21777)
+ [gfycat] Extend URL regular expression (#21779, #21780)
* [youtube] Restrict is_live extraction (#21782)
version 2019.07.14
Extractors
* [porn91] Fix extraction (#21312)
+ [yandexmusic] Extract track number and disk number (#21421)
+ [yandexmusic] Add support for multi disk albums (#21420, #21421)
* [lynda] Handle missing subtitles (#20490, #20513)
+ [youtube] Add more invidious instances to URL regular expression (#21694)
* [twitter] Improve uploader id extraction (#21705)
* [spankbang] Fix and improve metadata extraction
* [spankbang] Fix extraction (#21763, #21764)
+ [dlive] Add support for dlive.tv (#18080)
+ [livejournal] Add support for livejournal.com (#21526)
* [roosterteeth] Fix free episode extraction (#16094)
* [dbtv] Fix extraction
* [bellator] Fix extraction
- [rudo] Remove extractor (#18430, #18474)
* [facebook] Fallback to twitter:image meta for thumbnail extraction (#21224)
* [bleacherreport] Fix Bleacher Report CMS extraction
* [espn] Fix fivethirtyeight.com extraction
* [5tv] Relax video URL regular expression and support https URLs
* [youtube] Fix is_live extraction (#21734)
* [youtube] Fix authentication (#11270)
version 2019.07.12
Core
+ [adobepass] Add support for AT&T U-verse (mso ATT) (#13938, #21016)
Extractors
+ [mgtv] Pass Referer HTTP header for format URLs (#21726)
+ [beeg] Add support for api/v6 v2 URLs without t argument (#21701)
* [voxmedia:volume] Improve vox embed extraction (#16846)
* [funnyordie] Move extraction to VoxMedia extractor (#16846)
* [gameinformer] Fix extraction (#8895, #15363, #17206)
* [funk] Fix extraction (#17915)
* [packtpub] Relax lesson URL regular expression (#21695)
* [packtpub] Fix extraction (#21268)
* [philharmoniedeparis] Relax URL regular expression (#21672)
* [peertube] Detect embed URLs in generic extraction (#21666)
* [mixer:vod] Relax URL regular expression (#21657, #21658)
+ [lecturio] Add support for id based URLs (#21630)
+ [go] Add site info for disneynow (#21613)
* [ted] Restrict info regular expression (#21631)
* [twitch:vod] Actualize m3u8 URL (#21538, #21607)
* [vzaar] Fix videos with empty title (#21606)
* [tvland] Fix extraction (#21384)
* [arte] Clean extractor (#15583, #21614)
version 2019.07.02
Core
+ [utils] Introduce random_user_agent and use as default User-Agent (#21546)
Extractors
+ [vevo] Add support for embed.vevo.com URLs (#21565)
+ [openload] Add support for oload.biz (#21574)
* [xiami] Update API base URL (#21575)
* [yourporn] Fix extraction (#21585)
+ [acast] Add support for URLs with episode id (#21444)
+ [dailymotion] Add support for DM.player embeds
* [soundcloud] Update client id
version 2019.06.27
Extractors
+ [go] Add support for disneynow.com (#21528)
* [mixer:vod] Relax URL regular expression (#21531, #21536)
* [drtv] Relax URL regular expression
* [fusion] Fix extraction (#17775, #21269)
- [nfb] Remove extractor (#21518)
+ [beeg] Add support for api/v6 v2 URLs (#21511)
+ [brightcove:new] Add support for playlists (#21331)
+ [openload] Add support for oload.life (#21495)
* [vimeo:channel,group] Make title extraction non fatal
* [vimeo:likes] Implement extractor in terms of channel extractor (#21493)
+ [pornhub] Add support for more paged video sources
+ [pornhub] Add support for downloading single pages and search pages (#15570)
* [pornhub] Rework extractors (#11922, #16078, #17454, #17936)
+ [youtube] Add another signature function pattern
* [tf1] Fix extraction (#21365, #21372)
* [crunchyroll] Move Accept-Language workaround to video extractor since
it causes playlists not to list any videos
* [crunchyroll:playlist] Fix and relax title extraction (#21291, #21443)
version 2019.06.21
Core
* [utils] Restrict parse_codecs and add theora as known vcodec (#21381)
Extractors
* [youtube] Update signature function patterns (#21469, #21476)
* [youtube] Make --write-annotations non fatal (#21452)
+ [sixplay] Add support for rtlmost.hu (#21405)
* [youtube] Hardcode codec metadata for av01 video only formats (#21381)
* [toutv] Update client key (#21370)
+ [biqle] Add support for new embed domain
* [cbs] Improve DRM protected videos detection (#21339)
version 2019.06.08
Core
* [downloader/common] Improve rate limit (#21301)
* [utils] Improve strip_or_none
* [extractor/common] Strip src attribute for HTML5 entries code (#18485,
#21169)
Extractors
* [ted] Fix playlist extraction (#20844, #21032)
* [vlive:playlist] Fix video extraction when no playlist is found (#20590)
+ [vlive] Add CH+ support (#16887, #21209)
+ [openload] Add support for oload.website (#21329)
+ [tvnow] Extract HD formats (#21201)
+ [redbulltv] Add support for rrn:content URLs (#21297)
* [youtube] Fix average rating extraction (#21304)
+ [bitchute] Extract HTML5 formats (#21306)
* [cbsnews] Fix extraction (#9659, #15397)
* [vvvvid] Relax URL regular expression (#21299)
+ [prosiebensat1] Add support for new API (#21272)
+ [vrv] Extract adaptive_hls formats (#21243)
* [viki] Switch to HTTPS (#21001)
* [LiveLeak] Check if the original videos exist (#21206, #21208)
* [rtp] Fix extraction (#15099)
* [youtube] Improve DRM protected videos detection (#1774)
+ [srgssrplay] Add support for popupvideoplayer URLs (#21155)
+ [24video] Add support for porno.24video.net (#21194)
+ [24video] Add support for 24video.site (#21193)
- [pornflip] Remove extractor
- [criterion] Remove extractor (#21195)
* [pornhub] Use HTTPS (#21061)
* [bitchute] Fix uploader extraction (#21076)
* [streamcloud] Reduce waiting time to 6 seconds (#21092)
- [novamov] Remove extractors (#21077)
+ [openload] Add support for oload.press (#21135)
* [vivo] Fix extraction (#18906, #19217)
version 2019.05.20
Core
+ [extractor/common] Move workaround for applying first Set-Cookie header
into a separate _apply_first_set_cookie_header method
Extractors
* [safari] Fix authentication (#21090)
* [vk] Use _apply_first_set_cookie_header
* [vrt] Fix extraction (#20527)
+ [canvas] Add support for vrtnieuws and sporza site ids and extract
AES HLS formats
+ [vrv] Extract captions (#19238)
* [tele5] Improve video id extraction
* [tele5] Relax URL regular expression (#21020, #21063)
* [svtplay] Update API URL (#21075)
+ [yahoo:gyao] Add X-User-Agent header to dam proxy requests (#21071)
version 2019.05.11
Core
* [utils] Transliterate "þ" as "th" (#20897)
Extractors
+ [cloudflarestream] Add support for videodelivery.net (#21049)
+ [byutv] Add support for DVR videos (#20574, #20676)
+ [gfycat] Add support for URLs with tags (#20696, #20731)
+ [openload] Add support for verystream.com (#20701, #20967)
* [youtube] Use sp field value for signature field name (#18841, #18927,
#21028)
+ [yahoo:gyao] Extend URL regular expression (#21008)
* [youtube] Fix channel id extraction (#20982, #21003)
+ [sky] Add support for news.sky.com (#13055)
+ [youtube:entrylistbase] Retry on 5xx HTTP errors (#20965)
+ [francetvinfo] Extend video id extraction (#20619, #20740)
* [4tube] Update token hosts (#20918)
* [hotstar] Move to API v2 (#20931)
* [fox] Fix API error handling under python 2 (#20925)
+ [redbulltv] Extend URL regular expression (#20922)
version 2019.04.30
Extractors
* [openload] Use real Chrome versions (#20902)
- [youtube] Remove info el for get_video_info request
* [youtube] Improve extraction robustness
- [dramafever] Remove extractor (#20868)
* [adn] Fix subtitle extraction (#12724)
+ [ccc] Extract creator (#20355)
+ [ccc:playlist] Add support for media.ccc.de playlists (#14601, #20355)
+ [sverigesradio] Add support for sverigesradio.se (#18635)
+ [cinemax] Add support for cinemax.com
* [sixplay] Try extracting non-DRM protected manifests (#20849)
+ [youtube] Extract Youtube Music Auto-generated metadata (#20599, #20742)
- [wrzuta] Remove extractor (#20684, #20801)
* [twitch] Prefer source format (#20850)
+ [twitcasting] Add support for private videos (#20843)
* [reddit] Validate thumbnail URL (#20030)
* [yandexmusic] Fix track URL extraction (#20820)
version 2019.04.24
Extractors
* [youtube] Fix extraction (#20758, #20759, #20761, #20762, #20764, #20766,
#20767, #20769, #20771, #20768, #20770)
* [toutv] Fix extraction and extract series info (#20757)
+ [vrv] Add support for movie listings (#19229)
+ [youtube] Print error when no data is available (#20737)
+ [soundcloud] Add support for new rendition and improve extraction (#20699)
+ [ooyala] Add support for geo verification proxy
+ [nrl] Add support for nrl.com (#15991)
+ [vimeo] Extract live archive source format (#19144)
+ [vimeo] Add support for live streams and improve info extraction (#19144)
+ [ntvcojp] Add support for cu.ntv.co.jp
+ [nhk] Extract RTMPT format
+ [nhk] Add support for audio URLs
+ [udemy] Add another course id extraction pattern (#20491)
+ [openload] Add support for oload.services (#20691)
+ [openload] Add support for openloed.co (#20691, #20693)
* [bravotv] Fix extraction (#19213)
version 2019.04.17
Extractors
* [openload] Randomize User-Agent (#20688)
+ [openload] Add support for oladblock domains (#20471)
* [adn] Fix subtitle extraction (#12724)
+ [aol] Add support for localized websites
+ [yahoo] Add support for GYAO episode URLs
+ [yahoo] Add support for streaming.yahoo.co.jp (#5811, #7098)
+ [yahoo] Add support for gyao.yahoo.co.jp
* [aenetworks] Fix history topic extraction and extract more formats
+ [cbs] Extract smpte and vtt subtitles
+ [streamango] Add support for streamcherry.com (#20592)
+ [yourporn] Add support for sxyprn.com (#20646)
* [mgtv] Fix extraction (#20650)
* [linkedin:learning] Use urljoin for form action URL (#20431)
+ [gdc] Add support for kaltura embeds (#20575)
* [dispeak] Improve mp4 bitrate extraction
* [kaltura] Sanitize embed URLs
* [jwplatform] Do not match manifest URLs (#20596)
* [aol] Restrict URL regular expression and improve format extraction
+ [tiktok] Add support for new URL schema (#20573)
+ [stv:player] Add support for player.stv.tv (#20586)
version 2019.04.07
Core
+ [downloader/external] Pass rtmp_conn to ffmpeg
Extractors
+ [ruutu] Add support for audio podcasts (#20473, #20545)
+ [xvideos] Extract all thumbnails (#20432)
+ [platzi] Add support for platzi.com (#20562)
* [dvtv] Fix extraction (#18514, #19174)
+ [vrv] Add basic support for individual movie links (#19229)
+ [bfi:player] Add support for player.bfi.org.uk (#19235)
* [hbo] Fix extraction and extract subtitles (#14629, #13709)
* [youtube] Extract srv[1-3] subtitle formats (#20566)
* [adultswim] Fix extraction (#18025)
* [teamcoco] Fix extraction and add support for subdomains (#17099, #20339)
* [adn] Fix subtitle compatibility with ffmpeg
* [adn] Fix extraction and add support for positioning styles (#20549)
* [vk] Use unique video id (#17848)
* [newstube] Fix extraction
* [rtl2] Actualize extraction
+ [adobeconnect] Add support for adobeconnect.com (#20283)
+ [gaia] Add support for authentication (#14605)
+ [mediasite] Add support for dashed ids and named catalogs (#20531)
version 2019.04.01
Core
* [utils] Improve int_or_none and float_or_none (#20403)
* Check for valid --min-sleep-interval when --max-sleep-interval is specified
(#20435)
Extractors
+ [weibo] Extend URL regular expression (#20496)
+ [xhamster] Add support for xhamster.one (#20508)
+ [mediasite] Add support for catalogs (#20507)
+ [teamtreehouse] Add support for teamtreehouse.com (#9836)
+ [ina] Add support for audio URLs
* [ina] Improve extraction
* [cwtv] Fix episode number extraction (#20461)
* [npo] Improve DRM detection
+ [pornhub] Add support for DASH formats (#20403)
* [svtplay] Update API endpoint (#20430)
version 2019.03.18
Core
* [extractor/common] Improve HTML5 entries extraction
+ [utils] Introduce parse_bitrate
* [update] Hide update URLs behind redirect
* [extractor/common] Fix url meta field for unfragmented DASH formats (#20346)
Extractors
+ [yandexvideo] Add extractor
* [openload] Improve embed detection
+ [corus] Add support for bigbrothercanada.ca (#20357)
+ [orf:radio] Extract series (#20012)
+ [cbc:watch] Add support for gem.cbc.ca (#20251, #20359)
- [anysex] Remove extractor (#19279)
+ [ciscolive] Add support for new URL schema (#20320, #20351)
+ [youtube] Add support for invidiou.sh (#20309)
- [anitube] Remove extractor (#20334)
- [ruleporn] Remove extractor (#15344, #20324)
* [npr] Fix extraction (#10793, #13440)
* [biqle] Fix extraction (#11471, #15313)
* [viddler] Modernize
* [moevideo] Fix extraction
* [primesharetv] Remove extractor
* [hypem] Modernize and extract more metadata (#15320)
* [veoh] Fix extraction
* [escapist] Modernize
- [videomega] Remove extractor (#10108)
+ [beeg] Add support for beeg.porn (#20306)
* [vimeo:review] Improve config url extraction and extract original format
(#20305)
* [fox] Detect geo restriction and authentication errors (#20208)
version 2019.03.09
Core
* [extractor/common] Use compat_etree_Element
+ [compat] Introduce compat_etree_Element
* [extractor/common] Fallback url to base URL for DASH formats
* [extractor/common] Do not fail on invalid data while parsing F4M manifest
in non fatal mode
* [extractor/common] Return MPD manifest as format's url meta field (#20242)
* [utils] Strip #HttpOnly_ prefix from cookies files (#20219)
Extractors
* [francetv:site] Relax video id regular expression (#20268)
* [toutv] Detect invalid login error
* [toutv] Fix authentication (#20261)
+ [urplay] Extract timestamp (#20235)
+ [openload] Add support for oload.space (#20246)
* [facebook] Improve uploader extraction (#20250)
* [bbc] Use compat_etree_Element
* [crunchyroll] Use compat_etree_Element
* [npo] Improve ISM extraction
* [rai] Improve extraction (#20253)
* [paramountnetwork] Fix mgid extraction (#20241)
* [libsyn] Improve extraction (#20229)
+ [youtube] Add more invidious instances to URL regular expression (#20228)
* [spankbang] Fix extraction (#20023)
* [espn] Extend URL regular expression (#20013)
* [sixplay] Handle videos with empty assets (#20016)
+ [vimeo] Add support for Vimeo Pro portfolio protected videos (#20070)
version 2019.03.01
Core
+ [downloader/external] Add support for rate limit and retries for wget
* [downloader/external] Fix infinite retries for curl (#19303)
Extractors
* [npo] Fix extraction (#20084)
* [francetv:site] Extend video id regex (#20029, #20071)
+ [periscope] Extract width and height (#20015)
* [servus] Fix extraction (#19297)
* [bbccouk] Make subtitles non fatal (#19651)
* [metacafe] Fix family filter bypass (#19287)
version 2019.02.18
Extractors
* [tvp:website] Fix and improve extraction
+ [tvp] Detect unavailable videos
* [tvp] Fix description extraction and make thumbnail optional
+ [linuxacademy] Add support for linuxacademy.com (#12207)
* [bilibili] Update keys (#19233)
* [udemy] Extend URL regular expressions (#14330, #15883)
* [udemy] Update User-Agent and detect captcha (#14713, #15839, #18126)
* [noovo] Fix extraction (#19230)
* [rai] Relax URL regular expression (#19232)
+ [vshare] Pass Referer to download request (#19205, #19221)
+ [openload] Add support for oload.live (#19222)
* [imgur] Use video id as title fallback (#18590)
+ [twitch] Add new source format detection approach (#19193)
* [tvplayhome] Fix video id extraction (#19190)
* [tvplayhome] Fix episode metadata extraction (#19190)
* [rutube:embed] Fix extraction (#19163)
+ [rutube:embed] Add support for private videos (#19163)
+ [soundcloud] Extract more metadata
+ [trunews] Add support for trunews.com (#19153)
+ [linkedin:learning] Extract chapter_number and chapter_id (#19162)
version 2019.02.08
Core
* [utils] Improve JSON-LD regular expression (#18058)
* [YoutubeDL] Fallback to ie_key of matching extractor while making
download archive id when no explicit ie_key is provided (#19022)
Extractors
+ [malltv] Add support for mall.tv (#18058, #17856)
+ [spankbang:playlist] Add support for playlists (#19145)
* [spankbang] Extend URL regular expression
* [trutv] Fix extraction (#17336)
* [toutv] Fix authentication (#16398, #18700)
* [pornhub] Fix tags and categories extraction (#13720, #19135)
* [pornhd] Fix formats extraction
+ [pornhd] Extract like count (#19123, #19125)
* [radiocanada] Switch to the new media requests (#19115)
+ [teachable] Add support for courses.workitdaily.com (#18871)
- [vporn] Remove extractor (#16276)
+ [soundcloud:pagedplaylist] Add ie and title to entries (#19022, #19086)
+ [drtuber] Extract duration (#19078)
* [soundcloud] Fix paged playlists extraction, add support for albums and update client id
* [soundcloud] Update client id
* [drtv] Improve preference (#19079)
+ [openload] Add support for openload.pw and oload.pw (#18930)
+ [openload] Add support for oload.info (#19073)
* [crackle] Authorize media detail request (#16931)
version 2019.01.30.1
Core
* [postprocessor/ffmpeg] Fix avconv processing broken in #19025 (#19067)
version 2019.01.30
Core
* [postprocessor/ffmpeg] Do not copy Apple TV chapter tracks while embedding
subtitles (#19024, #19042)
* [postprocessor/ffmpeg] Disable "Last message repeated" messages (#19025)
Extractors
* [yourporn] Fix extraction and extract duration (#18815, #18852, #19061)
* [drtv] Improve extraction (#19039)
+ Add support for EncryptedUri videos
+ Extract more metadata
* Fix subtitles extraction
+ [fox] Add support for locked videos using cookies (#19060)
* [fox] Fix extraction for free videos (#19060)
+ [zattoo] Add support for tv.salt.ch (#19059)
version 2019.01.27
Core
+ [extractor/common] Extract season in _json_ld
* [postprocessor/ffmpeg] Fallback to ffmpeg/avconv for audio codec detection
(#681)
Extractors
* [vice] Fix extraction for locked videos (#16248)
+ [wakanim] Detect DRM protected videos
+ [wakanim] Add support for wakanim.tv (#14374)
* [usatoday] Fix extraction for videos with custom brightcove partner id
(#18990)
* [drtv] Fix extraction (#18989)
* [nhk] Extend URL regular expression (#18968)
* [go] Fix Adobe Pass requests for Disney Now (#18901)
+ [openload] Add support for oload.club (#18969)
version 2019.01.24
Core
* [YoutubeDL] Fix negation for string operators in format selection (#18961)
version 2019.01.23
Core
* [utils] Fix urljoin for paths with non-http(s) schemes
* [extractor/common] Improve jwplayer relative URL handling (#18892)
+ [YoutubeDL] Add negation support for string comparisons in format selection
expressions (#18600, #18805)
* [extractor/common] Improve HLS video-only format detection (#18923)
Extractors
* [crunchyroll] Extend URL regular expression (#18955)
* [pornhub] Bypass scrape detection (#4822, #5930, #7074, #10175, #12722,
#17197, #18338, #18842, #18899)
+ [vrv] Add support for authentication (#14307)
* [videomore:season] Fix extraction
* [videomore] Improve extraction (#18908)
+ [tnaflix] Pass Referer in metadata request (#18925)
* [radiocanada] Relax DRM check (#18608, #18609)
* [vimeo] Fix video password verification for videos protected by
Referer HTTP header
+ [hketv] Add support for hkedcity.net (#18696)
+ [streamango] Add support for fruithosts.net (#18710)
+ [instagram] Add support for tags (#18757)
+ [odnoklassniki] Detect paid videos (#18876)
* [ted] Correct acodec for HTTP formats (#18923)
* [cartoonnetwork] Fix extraction (#15664, #17224)
* [vimeo] Fix extraction for password protected player URLs (#18889)
version 2019.01.17
Extractors
* [youtube] Extend JS player signature function name regular expressions
(#18890, #18891, #18893)
version 2019.01.16 version 2019.01.16
Core Core
@ -276,7 +1222,7 @@ Extractors
+ [youtube] Extract channel meta fields (#9676, #12939) + [youtube] Extract channel meta fields (#9676, #12939)
* [porntube] Fix extraction (#17541) * [porntube] Fix extraction (#17541)
* [asiancrush] Fix extraction (#15630) * [asiancrush] Fix extraction (#15630)
+ [twitch:clips] Extend URL regular expression (closes #17559) + [twitch:clips] Extend URL regular expression (#17559)
+ [vzaar] Add support for HLS + [vzaar] Add support for HLS
* [tube8] Fix metadata extraction (#17520) * [tube8] Fix metadata extraction (#17520)
* [eporner] Extract JSON-LD (#17519) * [eporner] Extract JSON-LD (#17519)


@ -1,7 +1,7 @@
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
clean: clean:
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
find . -name "*.pyc" -delete find . -name "*.pyc" -delete
find . -name "*.class" -delete find . -name "*.class" -delete
@ -78,8 +78,12 @@ README.md: youtube_dl/*.py youtube_dl/*/*.py
CONTRIBUTING.md: README.md CONTRIBUTING.md: README.md
$(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md $(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md
.github/ISSUE_TEMPLATE.md: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md youtube_dl/version.py issuetemplates: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md youtube_dl/version.py
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md .github/ISSUE_TEMPLATE.md $(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE/1_broken_site.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE/2_site_support_request.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE/4_bug_report.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md .github/ISSUE_TEMPLATE/5_feature_request.md
supportedsites: supportedsites:
$(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md $(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md

README.md

@ -1,4 +1,4 @@
[![Build Status](https://travis-ci.org/rg3/youtube-dl.svg?branch=master)](https://travis-ci.org/rg3/youtube-dl) [![Build Status](https://travis-ci.org/ytdl-org/youtube-dl.svg?branch=master)](https://travis-ci.org/ytdl-org/youtube-dl)
youtube-dl - download videos from youtube.com or other video platforms youtube-dl - download videos from youtube.com or other video platforms
@ -43,7 +43,7 @@ Or with [MacPorts](https://www.macports.org/):
sudo port install youtube-dl sudo port install youtube-dl
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html). Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://ytdl-org.github.io/youtube-dl/download.html).
# DESCRIPTION # DESCRIPTION
**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like. **youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
@ -642,6 +642,7 @@ The simplest case is requesting a specific format, for example with `-f 22` you
You can also use a file extension (currently `3gp`, `aac`, `flv`, `m4a`, `mp3`, `mp4`, `ogg`, `wav`, `webm` are supported) to download the best quality format of a particular file extension served as a single file, e.g. `-f webm` will download the best quality format with the `webm` extension served as a single file. You can also use a file extension (currently `3gp`, `aac`, `flv`, `m4a`, `mp3`, `mp4`, `ogg`, `wav`, `webm` are supported) to download the best quality format of a particular file extension served as a single file, e.g. `-f webm` will download the best quality format with the `webm` extension served as a single file.
You can also use special names to select particular edge case formats: You can also use special names to select particular edge case formats:
- `best`: Select the best quality format represented by a single file with video and audio. - `best`: Select the best quality format represented by a single file with video and audio.
- `worst`: Select the worst quality format represented by a single file with video and audio. - `worst`: Select the worst quality format represented by a single file with video and audio.
- `bestvideo`: Select the best quality video-only format (e.g. DASH video). May not be available. - `bestvideo`: Select the best quality video-only format (e.g. DASH video). May not be available.
@ -658,6 +659,7 @@ If you want to download several formats of the same video use a comma as a separ
You can also filter the video formats by putting a condition in brackets, as in `-f "best[height=720]"` (or `-f "[filesize>10M]"`). You can also filter the video formats by putting a condition in brackets, as in `-f "best[height=720]"` (or `-f "[filesize>10M]"`).
The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals): The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals):
- `filesize`: The number of bytes, if known in advance - `filesize`: The number of bytes, if known in advance
- `width`: Width of the video, if known - `width`: Width of the video, if known
- `height`: Height of the video, if known - `height`: Height of the video, if known
@ -667,7 +669,8 @@ The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `
- `asr`: Audio sampling rate in Hertz - `asr`: Audio sampling rate in Hertz
- `fps`: Frame rate - `fps`: Frame rate
Filtering also works for the comparisons `=` (equals), `!=` (not equals), `^=` (begins with), `$=` (ends with), `*=` (contains) and the following string meta fields: Filtering also works for the comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains) and the following string meta fields:
- `ext`: File extension - `ext`: File extension
- `acodec`: Name of the audio codec in use - `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use - `vcodec`: Name of the video codec in use
@ -675,6 +678,8 @@ Also filtering work for comparisons `=` (equals), `!=` (not equals), `^=` (begin
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`) - `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`)
- `format_id`: A short description of the format - `format_id`: A short description of the format
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain).
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the video hoster. Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the video hoster.
Formats for which the value is not known are excluded unless you put a question mark (`?`) after the operator. You can combine format filters, so `-f "[height <=? 720][tbr>500]"` selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s. Formats for which the value is not known are excluded unless you put a question mark (`?`) after the operator. You can combine format filters, so `-f "[height <=? 720][tbr>500]"` selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s.
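The string comparisons, the `!` negation prefix and the `?` suffix described above can be sketched in Python. This is a minimal illustration under stated assumptions, not youtube-dl's actual implementation: `string_filter`, `STRING_OPS` and `FILTER_RE` are hypothetical names, and a format is modelled as a plain dict of meta fields.

```python
import re

# Hypothetical names for illustration only; not youtube-dl's internal API.
STRING_OPS = {
    '=': lambda attr, value: attr == value,
    '^=': lambda attr, value: attr.startswith(value),
    '$=': lambda attr, value: attr.endswith(value),
    '*=': lambda attr, value: value in attr,
}

FILTER_RE = re.compile(
    r'(?P<key>[a-z_]+)\s*'
    r'(?P<negation>!\s*)?'            # optional '!' placed before the operator
    r'(?P<op>\^=|\$=|\*=|=)'
    r'(?P<none_inclusive>\s*\?)?\s*'  # optional '?' placed after the operator
    r'(?P<value>[a-zA-Z0-9._-]+)$'
)


def string_filter(spec):
    """Build a predicate over format dicts from a spec such as 'ext!*=web'."""
    m = FILTER_RE.match(spec.strip())
    if m is None:
        raise ValueError('Invalid filter specification %r' % spec)
    compare = STRING_OPS[m.group('op')]
    negate = bool(m.group('negation'))
    none_inclusive = bool(m.group('none_inclusive'))
    key, value = m.group('key'), m.group('value')

    def predicate(fmt):
        attr = fmt.get(key)
        if attr is None:
            # Unknown values are excluded unless '?' follows the operator.
            return none_inclusive
        result = compare(attr, value)
        return not result if negate else result

    return predicate


formats = [
    {'ext': 'mp4', 'vcodec': 'avc1.64001f'},
    {'ext': 'webm', 'vcodec': 'vp9'},
    {'ext': 'mp4'},  # vcodec unknown
]
keep = string_filter('vcodec!^=?avc1')
print([keep(f) for f in formats])
```

In this sketch `vcodec!^=?avc1` rejects the first format, keeps the second, and keeps the third only because `?` admits formats whose `vcodec` is unknown.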
@ -683,7 +688,7 @@ You can merge the video and audio of two formats into a single file using `-f <v
Format selectors can also be grouped using parentheses, for example if you want to download the best mp4 and webm formats with a height lower than 480 you can use `-f '(mp4,webm)[height<480]'`. Format selectors can also be grouped using parentheses, for example if you want to download the best mp4 and webm formats with a height lower than 480 you can use `-f '(mp4,webm)[height<480]'`.
Since the end of April 2015 and version 2015.04.26, youtube-dl uses `-f bestvideo+bestaudio/best` as the default format selection (see [#5447](https://github.com/rg3/youtube-dl/issues/5447), [#5456](https://github.com/rg3/youtube-dl/issues/5456)). If ffmpeg or avconv are installed this results in downloading `bestvideo` and `bestaudio` separately and muxing them together into a single file giving the best overall quality available. Otherwise it falls back to `best` and results in downloading the best available quality served as a single file. `best` is also needed for videos that don't come from YouTube because they don't provide the audio and video in two different files. If you want to only download some DASH formats (for example if you are not interested in getting videos with a resolution higher than 1080p), you can add `-f bestvideo[height<=?1080]+bestaudio/best` to your configuration file. Note that if you use youtube-dl to stream to `stdout` (and most likely to pipe it to your media player then), i.e. you explicitly specify output template as `-o -`, youtube-dl still uses `-f best` format selection in order to start content delivery immediately to your player and not to wait until `bestvideo` and `bestaudio` are downloaded and muxed. Since the end of April 2015 and version 2015.04.26, youtube-dl uses `-f bestvideo+bestaudio/best` as the default format selection (see [#5447](https://github.com/ytdl-org/youtube-dl/issues/5447), [#5456](https://github.com/ytdl-org/youtube-dl/issues/5456)). If ffmpeg or avconv are installed this results in downloading `bestvideo` and `bestaudio` separately and muxing them together into a single file giving the best overall quality available. Otherwise it falls back to `best` and results in downloading the best available quality served as a single file. `best` is also needed for videos that don't come from YouTube because they don't provide the audio and video in two different files. 
If you want to only download some DASH formats (for example if you are not interested in getting videos with a resolution higher than 1080p), you can add `-f bestvideo[height<=?1080]+bestaudio/best` to your configuration file. Note that if you use youtube-dl to stream to `stdout` (and most likely to pipe it to your media player then), i.e. you explicitly specify output template as `-o -`, youtube-dl still uses `-f best` format selection in order to start content delivery immediately to your player and not to wait until `bestvideo` and `bestaudio` are downloaded and muxed.
If you want to preserve the old format selection behavior (prior to youtube-dl 2015.04.26), i.e. you want to download the best available quality media served as a single file, you should explicitly specify your choice with `-f best`. You may want to add it to the [configuration file](#configuration) in order not to type it every time you run youtube-dl. If you want to preserve the old format selection behavior (prior to youtube-dl 2015.04.26), i.e. you want to download the best available quality media served as a single file, you should explicitly specify your choice with `-f best`. You may want to add it to the [configuration file](#configuration) in order not to type it every time you run youtube-dl.
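The fallback behaviour of `bestvideo+bestaudio/best` can be illustrated with a simplified Python model. It is a sketch under assumptions, not the real selector: `pick` and `select` are hypothetical helpers, formats are plain dicts sorted from worst to best, and the `can_merge` flag stands in for the presence of ffmpeg or avconv.

```python
def pick(formats, name):
    """Resolve one special name against formats sorted from worst to best."""
    if name == 'best':
        # A single file carrying both video and audio.
        pool = [f for f in formats
                if f.get('vcodec') != 'none' and f.get('acodec') != 'none']
    elif name == 'bestvideo':
        pool = [f for f in formats if f.get('acodec') == 'none']
    elif name == 'bestaudio':
        pool = [f for f in formats if f.get('vcodec') == 'none']
    else:
        raise ValueError('Unsupported selector %r' % name)
    return pool[-1] if pool else None


def select(formats, selector, can_merge):
    """Return format ids for the first '/'-separated alternative that resolves."""
    for alternative in selector.split('/'):
        parts = alternative.split('+')
        if len(parts) > 1 and not can_merge:
            continue  # merging is assumed to need ffmpeg or avconv
        chosen = [pick(formats, part) for part in parts]
        if all(f is not None for f in chosen):
            return [f['format_id'] for f in chosen]
    return None


formats = [
    {'format_id': '18', 'vcodec': 'avc1', 'acodec': 'mp4a'},
    {'format_id': '140', 'vcodec': 'none', 'acodec': 'mp4a'},
    {'format_id': '137', 'vcodec': 'avc1', 'acodec': 'none'},
]
print(select(formats, 'bestvideo+bestaudio/best', can_merge=True))
print(select(formats, 'bestvideo+bestaudio/best', can_merge=False))
```

With merging available the model picks the separate video and audio formats; without it, or on a site that only offers combined files, it falls through to `best`.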
@ -695,7 +700,7 @@ Note that on Windows you may need to use double quotes instead of single.
# Download best mp4 format available or any other best if no mp4 available # Download best mp4 format available or any other best if no mp4 available
$ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best' $ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best'
# Download best format available but not better that 480p # Download best format available but no better than 480p
$ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]' $ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]'
# Download best video only format but no bigger than 50 MB # Download best video only format but no bigger than 50 MB
@ -734,7 +739,7 @@ $ youtube-dl --dateafter 20000101 --datebefore 20091231
### How do I update youtube-dl? ### How do I update youtube-dl?
If you've followed [our manual installation instructions](https://rg3.github.io/youtube-dl/download.html), you can simply run `youtube-dl -U` (or, on Linux, `sudo youtube-dl -U`). If you've followed [our manual installation instructions](https://ytdl-org.github.io/youtube-dl/download.html), you can simply run `youtube-dl -U` (or, on Linux, `sudo youtube-dl -U`).
If you have used pip, a simple `sudo pip install -U youtube-dl` is sufficient to update. If you have used pip, a simple `sudo pip install -U youtube-dl` is sufficient to update.
@ -744,11 +749,11 @@ As a last resort, you can also uninstall the version installed by your package m
sudo apt-get remove -y youtube-dl sudo apt-get remove -y youtube-dl
Afterwards, simply follow [our manual installation instructions](https://rg3.github.io/youtube-dl/download.html): Afterwards, simply follow [our manual installation instructions](https://ytdl-org.github.io/youtube-dl/download.html):
``` ```
sudo wget https://yt-dl.org/latest/youtube-dl -O /usr/local/bin/youtube-dl sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl
sudo chmod a+x /usr/local/bin/youtube-dl sudo chmod a+rx /usr/local/bin/youtube-dl
hash -r hash -r
``` ```
@ -778,7 +783,7 @@ Most people asking this question are not aware that youtube-dl now defaults to d
### I get HTTP error 402 when trying to download a video. What's this? ### I get HTTP error 402 when trying to download a video. What's this?
Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering providing a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is pointing a web browser to the YouTube URL, solving the CAPTCHA, and restarting youtube-dl. Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering providing a way to let you solve the CAPTCHA](https://github.com/ytdl-org/youtube-dl/issues/154), but at the moment, your best course of action is pointing a web browser to the YouTube URL, solving the CAPTCHA, and restarting youtube-dl.
### Do I need any other programs? ### Do I need any other programs?
@ -843,7 +848,7 @@ means you're using an outdated version of Python. Please update to Python 2.6 or
### What is this binary file? Where has the code gone? ### What is this binary file? Where has the code gone?
Since June 2012 ([#342](https://github.com/rg3/youtube-dl/issues/342)) youtube-dl is packed as an executable zipfile, simply unzip it (might need renaming to `youtube-dl.zip` first on some systems) or clone the git repository, as laid out above. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make youtube-dl`. Since June 2012 ([#342](https://github.com/ytdl-org/youtube-dl/issues/342)) youtube-dl is packed as an executable zipfile, simply unzip it (might need renaming to `youtube-dl.zip` first on some systems) or clone the git repository, as laid out above. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make youtube-dl`.
### The exe throws an error due to missing `MSVCR100.dll` ### The exe throws an error due to missing `MSVCR100.dll`
@ -902,7 +907,7 @@ When youtube-dl detects an HLS video, it can download it either with the built-i
When youtube-dl knows that one particular downloader works better for a given website, that downloader will be picked. Otherwise, youtube-dl will pick the best downloader for general compatibility, which at the moment happens to be ffmpeg. This choice may change in future versions of youtube-dl, with improvements of the built-in downloader and/or ffmpeg. When youtube-dl knows that one particular downloader works better for a given website, that downloader will be picked. Otherwise, youtube-dl will pick the best downloader for general compatibility, which at the moment happens to be ffmpeg. This choice may change in future versions of youtube-dl, with improvements of the built-in downloader and/or ffmpeg.
In particular, the generic extractor (used when your website is not in the [list of supported sites by youtube-dl](https://rg3.github.io/youtube-dl/supportedsites.html)) cannot mandate one specific downloader. In particular, the generic extractor (used when your website is not in the [list of supported sites by youtube-dl](https://ytdl-org.github.io/youtube-dl/supportedsites.html)) cannot mandate one specific downloader.
If you put either `--hls-prefer-native` or `--hls-prefer-ffmpeg` into your configuration, a different subset of videos will fail to download correctly. Instead, it is much better to [file an issue](https://yt-dl.org/bug) or a pull request which details why the native or the ffmpeg HLS downloader is a better choice for your use case. If you put either `--hls-prefer-native` or `--hls-prefer-ffmpeg` into your configuration, a different subset of videos will fail to download correctly. Instead, it is much better to [file an issue](https://yt-dl.org/bug) or a pull request which details why the native or the ffmpeg HLS downloader is a better choice for your use case.
@ -942,7 +947,7 @@ youtube-dl is an open-source project manned by too few volunteers, so we'd rathe
# DEVELOPER INSTRUCTIONS # DEVELOPER INSTRUCTIONS
Most users do not need to build youtube-dl and can [download the builds](https://rg3.github.io/youtube-dl/download.html) or get them from their distribution. Most users do not need to build youtube-dl and can [download the builds](https://ytdl-org.github.io/youtube-dl/download.html) or get them from their distribution.
To run youtube-dl as a developer, you don't need to build anything either. Simply execute To run youtube-dl as a developer, you don't need to build anything either. Simply execute
@ -970,7 +975,7 @@ If you want to add support for a new site, first of all **make sure** this site
After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`): After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork) 1. [Fork this repository](https://github.com/ytdl-org/youtube-dl/fork)
2. Check out the source code with: 2. Check out the source code with:
git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git
@ -1022,9 +1027,9 @@ After you have ensured this site is distributing its content legally, you can fo
# TODO more properties (see youtube_dl/extractor/common.py) # TODO more properties (see youtube_dl/extractor/common.py)
} }
``` ```
5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py). 5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want. 7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want.
8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](http://flake8.pycqa.org/en/latest/index.html#quickstart): 8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](http://flake8.pycqa.org/en/latest/index.html#quickstart):
$ flake8 youtube_dl/extractor/yourextractor.py $ flake8 youtube_dl/extractor/yourextractor.py
@ -1049,7 +1054,7 @@ Extractors are very fragile by nature since they depend on the layout of the sou
### Mandatory and optional metafields ### Mandatory and optional metafields
For extraction to work youtube-dl relies on metadata your extractor extracts and provides to youtube-dl expressed by an [information dictionary](https://github.com/rg3/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303) or simply *info dict*. Only the following meta fields in the *info dict* are considered mandatory for a successful extraction process by youtube-dl: For extraction to work youtube-dl relies on metadata your extractor extracts and provides to youtube-dl expressed by an [information dictionary](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303) or simply *info dict*. Only the following meta fields in the *info dict* are considered mandatory for a successful extraction process by youtube-dl:
- `id` (media identifier) - `id` (media identifier)
- `title` (media title) - `title` (media title)
@ -1057,7 +1062,7 @@ For extraction to work youtube-dl relies on metadata your extractor extracts and
In fact only the last option is technically mandatory (i.e. if you can't figure out the download location of the media the extraction does not make any sense). But by convention youtube-dl also treats `id` and `title` as mandatory. Thus the aforementioned metafields are the critical data that the extraction does not make any sense without and if any of them fail to be extracted then the extractor is considered completely broken. In fact only the last option is technically mandatory (i.e. if you can't figure out the download location of the media the extraction does not make any sense). But by convention youtube-dl also treats `id` and `title` as mandatory. Thus the aforementioned metafields are the critical data that the extraction does not make any sense without and if any of them fail to be extracted then the extractor is considered completely broken.
[Any field](https://github.com/rg3/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L188-L303) apart from the aforementioned ones are considered **optional**. That means that extraction should be **tolerant** to situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. [Any field](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L188-L303) apart from the aforementioned ones are considered **optional**. That means that extraction should be **tolerant** to situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields.
#### Example #### Example
@ -1211,15 +1216,83 @@ Incorrect:
'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4' 'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
``` ```
### Use safe conversion functions ### Inline values
Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well. Extracting variables is acceptable for reducing code duplication and improving readability of complex expressions. However, you should avoid extracting variables used only once and moving them to opposite parts of the extractor file, which makes reading the linear flow difficult.
#### Example
Correct:
```python
title = self._html_search_regex(r'<title>([^<]+)</title>', webpage, 'title')
```
Incorrect:
```python
TITLE_RE = r'<title>([^<]+)</title>'
# ...some lines of code...
title = self._html_search_regex(TITLE_RE, webpage, 'title')
```
### Collapse fallbacks
Multiple fallback values can quickly become unwieldy. Collapse multiple fallback values into a single expression via a list of patterns.
#### Example
Good:
```python
description = self._html_search_meta(
['og:description', 'description', 'twitter:description'],
webpage, 'description', default=None)
```
Unwieldy:
```python
description = (
self._og_search_description(webpage, default=None)
or self._html_search_meta('description', webpage, default=None)
or self._html_search_meta('twitter:description', webpage, default=None))
```
Methods supporting list of patterns are: `_search_regex`, `_html_search_regex`, `_og_search_property`, `_html_search_meta`.
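As a rough illustration of why a list of patterns keeps the call site short, here is a minimal, self-contained sketch. `search_first` is a hypothetical helper written only for this example and is not youtube-dl's implementation; it merely mimics the idea of trying several patterns in order and falling back to a default.

```python
import re


def search_first(patterns, text, default=None):
    """Return group 1 of the first pattern that matches, else `default`.

    Hypothetical stand-in for illustration only; accepts either a single
    pattern string or a list/tuple of pattern strings.
    """
    if isinstance(patterns, str):
        patterns = [patterns]
    for pattern in patterns:
        match = re.search(pattern, text)
        if match:
            return match.group(1)
    return default


# Sample page content (made up for the example).
webpage = '<meta name="description" content="A short clip">'

# One expression covers both fallbacks; the first pattern that matches wins.
description = search_first(
    [r'property="og:description" content="([^"]+)"',
     r'name="description" content="([^"]+)"'],
    webpage)
```

The caller states the fallbacks once, in priority order, instead of chaining several `or` expressions.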
### Trailing parentheses
Always move trailing parentheses after the last argument.
#### Example
Correct:
```python
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list)
```
Incorrect:
```python
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list,
)
```
### Use convenience conversion and parsing functions
Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
Use `url_or_none` for safe URL processing. Use `url_or_none` for safe URL processing.
Use `try_get` for safe metadata extraction from parsed JSON. Use `try_get` for safe metadata extraction from parsed JSON.
Explore [`youtube_dl/utils.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/utils.py) for more useful convenience functions. Use `unified_strdate` for uniform `upload_date` or any `YYYYMMDD` meta field extraction, `unified_timestamp` for uniform `timestamp` extraction, `parse_filesize` for `filesize` extraction, `parse_count` for count meta fields extraction, `parse_resolution`, `parse_duration` for `duration` extraction, `parse_age_limit` for `age_limit` extraction.
Explore [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py) for more useful convenience functions.
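The idea behind these helpers can be sketched in a few lines. The functions below, `to_int_or_none` and `get_nested`, are hypothetical simplified stand-ins written for this example; they are not the real `int_or_none` or `try_get`, whose exact signatures should be checked in `youtube_dl/utils.py`. The sample `video` dict is also made up.

```python
def to_int_or_none(value, default=None):
    """Convert `value` to int, returning `default` instead of raising."""
    if value is None:
        return default
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def get_nested(src, getter, expected_type=None):
    """Apply `getter` to `src`; return None on lookup errors or type mismatch."""
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


# Parsed JSON in which some optional fields are missing.
video = {'views': '1234', 'stats': {}}

view_count = to_int_or_none(video.get('views'))    # string converted to int
like_count = to_int_or_none(video.get('likes'))    # missing key yields None
author = get_nested(video, lambda x: x['owner']['name'], str)  # missing path yields None
```

Because a failed conversion or lookup yields `None` rather than an exception, a missing optional field does not break extraction of the mandatory ones.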
#### More examples #### More examples
@ -1238,7 +1311,7 @@ view_count = int_or_none(video.get('views'))
# EMBEDDING YOUTUBE-DL # EMBEDDING YOUTUBE-DL
youtube-dl makes the best effort to be a good command-line program, and thus should be callable from any programming language. If you encounter any problems parsing its output, feel free to [create a report](https://github.com/rg3/youtube-dl/issues/new). youtube-dl makes the best effort to be a good command-line program, and thus should be callable from any programming language. If you encounter any problems parsing its output, feel free to [create a report](https://github.com/ytdl-org/youtube-dl/issues/new).
From a Python program, you can embed youtube-dl in a more powerful fashion, like this: From a Python program, you can embed youtube-dl in a more powerful fashion, like this:
@ -1251,7 +1324,7 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc']) ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
``` ```
Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/3e4cedf9e8cd3157df2457df7274d0c842421945/youtube_dl/YoutubeDL.py#L137-L312). For a start, if you want to intercept youtube-dl's output, set a `logger` object. Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/ytdl-org/youtube-dl/blob/3e4cedf9e8cd3157df2457df7274d0c842421945/youtube_dl/YoutubeDL.py#L137-L312). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file: Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
@ -1292,7 +1365,7 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
# BUGS # BUGS
Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues>. Unless you were prompted to or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the IRC channel [#youtube-dl](irc://chat.freenode.net/#youtube-dl) on freenode ([webchat](https://webchat.freenode.net/?randomnick=1&channels=youtube-dl)). Bugs and suggestions should be reported at: <https://github.com/ytdl-org/youtube-dl/issues>. Unless you were prompted to or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the IRC channel [#youtube-dl](irc://chat.freenode.net/#youtube-dl) on freenode ([webchat](https://webchat.freenode.net/?randomnick=1&channels=youtube-dl)).
**Please include the full output of youtube-dl when run with `-v`**, i.e. **add** `-v` flag to **your command line**, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this: **Please include the full output of youtube-dl when run with `-v`**, i.e. **add** `-v` flag to **your command line**, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
``` ```
@ -1338,11 +1411,11 @@ Before reporting any issue, type `youtube-dl -U`. This should report that you're
### Is the issue already documented? ### Is the issue already documented?
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/rg3/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity. Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/ytdl-org/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
### Why are existing options not enough? ### Why are existing options not enough?
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem. Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
### Is there enough context in your bug report? ### Is there enough context in your bug report?


@ -322,7 +322,7 @@ class GITBuilder(GITInfoBuilder):
class YoutubeDLBuilder(object): class YoutubeDLBuilder(object):
authorizedUsers = ['fraca7', 'phihag', 'rg3', 'FiloSottile'] authorizedUsers = ['fraca7', 'phihag', 'rg3', 'FiloSottile', 'ytdl-org']
def __init__(self, **kwargs): def __init__(self, **kwargs):
if self.repoName != 'youtube-dl': if self.repoName != 'youtube-dl':


@ -45,12 +45,12 @@ for test in gettestcases():
RESULT = ('.' + domain + '\n' in LIST or '\n' + domain + '\n' in LIST) RESULT = ('.' + domain + '\n' in LIST or '\n' + domain + '\n' in LIST)
if RESULT and ('info_dict' not in test or 'age_limit' not in test['info_dict'] or if RESULT and ('info_dict' not in test or 'age_limit' not in test['info_dict']
test['info_dict']['age_limit'] != 18): or test['info_dict']['age_limit'] != 18):
print('\nPotential missing age_limit check: {0}'.format(test['name'])) print('\nPotential missing age_limit check: {0}'.format(test['name']))
elif not RESULT and ('info_dict' in test and 'age_limit' in test['info_dict'] and elif not RESULT and ('info_dict' in test and 'age_limit' in test['info_dict']
test['info_dict']['age_limit'] == 18): and test['info_dict']['age_limit'] == 18):
print('\nPotential false negative: {0}'.format(test['name'])) print('\nPotential false negative: {0}'.format(test['name']))
else: else:


@ -1,7 +1,6 @@
#!/usr/bin/env python #!/usr/bin/env python
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import io import io
import json import json
import mimetypes import mimetypes
@ -15,7 +14,6 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.compat import ( from youtube_dl.compat import (
compat_basestring, compat_basestring,
compat_input,
compat_getpass, compat_getpass,
compat_print, compat_print,
compat_urllib_request, compat_urllib_request,
@ -27,8 +25,8 @@ from youtube_dl.utils import (
class GitHubReleaser(object): class GitHubReleaser(object):
_API_URL = 'https://api.github.com/repos/rg3/youtube-dl/releases' _API_URL = 'https://api.github.com/repos/ytdl-org/youtube-dl/releases'
_UPLOADS_URL = 'https://uploads.github.com/repos/rg3/youtube-dl/releases/%s/assets?name=%s' _UPLOADS_URL = 'https://uploads.github.com/repos/ytdl-org/youtube-dl/releases/%s/assets?name=%s'
_NETRC_MACHINE = 'github.com' _NETRC_MACHINE = 'github.com'
def __init__(self, debuglevel=0): def __init__(self, debuglevel=0):
@ -40,28 +38,20 @@ class GitHubReleaser(object):
try: try:
info = netrc.netrc().authenticators(self._NETRC_MACHINE) info = netrc.netrc().authenticators(self._NETRC_MACHINE)
if info is not None: if info is not None:
self._username = info[0] self._token = info[2]
self._password = info[2]
compat_print('Using GitHub credentials found in .netrc...') compat_print('Using GitHub credentials found in .netrc...')
return return
else: else:
compat_print('No GitHub credentials found in .netrc') compat_print('No GitHub credentials found in .netrc')
except (IOError, netrc.NetrcParseError): except (IOError, netrc.NetrcParseError):
compat_print('Unable to parse .netrc') compat_print('Unable to parse .netrc')
self._username = compat_input( self._token = compat_getpass(
'Type your GitHub username or email address and press [Return]: ') 'Type your GitHub PAT (personal access token) and press [Return]: ')
self._password = compat_getpass(
'Type your GitHub password and press [Return]: ')
def _call(self, req): def _call(self, req):
if isinstance(req, compat_basestring): if isinstance(req, compat_basestring):
req = sanitized_Request(req) req = sanitized_Request(req)
# Authorizing manually since GitHub does not response with 401 with req.add_header('Authorization', 'token %s' % self._token)
# WWW-Authenticate header set (see
# https://developer.github.com/v3/#basic-authentication)
b64 = base64.b64encode(
('%s:%s' % (self._username, self._password)).encode('utf-8')).decode('ascii')
req.add_header('Authorization', 'Basic %s' % b64)
response = self._opener.open(req).read().decode('utf-8') response = self._opener.open(req).read().decode('utf-8')
return json.loads(response) return json.loads(response)
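
The switch from Basic authentication to a token header can be sketched on its own. This is a minimal Python 3 example using only the standard library; `build_authorized_request` and the `EXAMPLE_TOKEN` value are hypothetical, and the snippet builds the request without sending it.

```python
import urllib.request


def build_authorized_request(url, token):
    """Attach a token-based Authorization header to a request.

    Sketch only: reuses the 'token %s' header format shown in the hunk
    above; this helper is not part of the release script.
    """
    req = urllib.request.Request(url)
    req.add_header('Authorization', 'token %s' % token)
    return req


# Placeholder token; a real personal access token would come from .netrc
# or an interactive prompt, as in the script above.
req = build_authorized_request(
    'https://api.github.com/repos/ytdl-org/youtube-dl/releases',
    'EXAMPLE_TOKEN')
```

With a token, no username or base64 encoding step is needed, which is why the `base64` and `compat_input` imports are dropped in this change.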


@ -10,7 +10,7 @@ import textwrap
atom_template = textwrap.dedent("""\ atom_template = textwrap.dedent("""\
<?xml version="1.0" encoding="utf-8"?> <?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"> <feed xmlns="http://www.w3.org/2005/Atom">
<link rel="self" href="http://rg3.github.io/youtube-dl/update/releases.atom" /> <link rel="self" href="http://ytdl-org.github.io/youtube-dl/update/releases.atom" />
<title>youtube-dl releases</title> <title>youtube-dl releases</title>
<id>https://yt-dl.org/feed/youtube-dl-updates-feed</id> <id>https://yt-dl.org/feed/youtube-dl-updates-feed</id>
<updated>@TIMESTAMP@</updated> <updated>@TIMESTAMP@</updated>
@ -21,7 +21,7 @@ entry_template = textwrap.dedent("""
<entry> <entry>
<id>https://yt-dl.org/feed/youtube-dl-updates-feed/youtube-dl-@VERSION@</id> <id>https://yt-dl.org/feed/youtube-dl-updates-feed/youtube-dl-@VERSION@</id>
<title>New version @VERSION@</title> <title>New version @VERSION@</title>
<link href="http://rg3.github.io/youtube-dl" /> <link href="http://ytdl-org.github.io/youtube-dl" />
<content type="xhtml"> <content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml"> <div xmlns="http://www.w3.org/1999/xhtml">
Downloads available at <a href="https://yt-dl.org/downloads/@VERSION@/">https://yt-dl.org/downloads/@VERSION@/</a> Downloads available at <a href="https://yt-dl.org/downloads/@VERSION@/">https://yt-dl.org/downloads/@VERSION@/</a>


@ -78,8 +78,8 @@ sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
sed -i "s/<unreleased>/$version/" ChangeLog sed -i "s/<unreleased>/$version/" ChangeLog
/bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..." /bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..."
make README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md supportedsites make README.md CONTRIBUTING.md issuetemplates supportedsites
git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py ChangeLog git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE/1_broken_site.md .github/ISSUE_TEMPLATE/2_site_support_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md .github/ISSUE_TEMPLATE/4_bug_report.md .github/ISSUE_TEMPLATE/5_feature_request.md .github/ISSUE_TEMPLATE/6_question.md docs/supportedsites.md youtube_dl/version.py ChangeLog
git commit $gpg_sign_commits -m "release $version" git commit $gpg_sign_commits -m "release $version"
/bin/echo -e "\n### Now tagging, signing and pushing..." /bin/echo -e "\n### Now tagging, signing and pushing..."
@ -96,7 +96,7 @@ git push origin "$version"
REV=$(git rev-parse HEAD) REV=$(git rev-parse HEAD)
make youtube-dl youtube-dl.tar.gz make youtube-dl youtube-dl.tar.gz
read -p "VM running? (y/n) " -n 1 read -p "VM running? (y/n) " -n 1
wget "http://$buildserver/build/rg3/youtube-dl/youtube-dl.exe?rev=$REV" -O youtube-dl.exe wget "http://$buildserver/build/ytdl-org/youtube-dl/youtube-dl.exe?rev=$REV" -O youtube-dl.exe
mkdir -p "build/$version" mkdir -p "build/$version"
mv youtube-dl youtube-dl.exe "build/$version" mv youtube-dl youtube-dl.exe "build/$version"
mv youtube-dl.tar.gz "build/$version/youtube-dl-$version.tar.gz" mv youtube-dl.tar.gz "build/$version/youtube-dl-$version.tar.gz"


@ -24,7 +24,7 @@ total_bytes = 0
for page in itertools.count(1): for page in itertools.count(1):
releases = json.loads(compat_urllib_request.urlopen( releases = json.loads(compat_urllib_request.urlopen(
'https://api.github.com/repos/rg3/youtube-dl/releases?page=%s' % page 'https://api.github.com/repos/ytdl-org/youtube-dl/releases?page=%s' % page
).read().decode('utf-8')) ).read().decode('utf-8'))
if not releases: if not releases:


@ -26,12 +26,13 @@
- **AcademicEarth:Course** - **AcademicEarth:Course**
- **acast** - **acast**
- **acast:channel** - **acast:channel**
- **AddAnime**
- **ADN**: Anime Digital Network - **ADN**: Anime Digital Network
- **AdobeTV** - **AdobeConnect**
- **AdobeTVChannel** - **adobetv**
- **AdobeTVShow** - **adobetv:channel**
- **AdobeTVVideo** - **adobetv:embed**
- **adobetv:show**
- **adobetv:video**
- **AdultSwim** - **AdultSwim**
- **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network and History Vault - **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network and History Vault
- **afreecatv**: afreecatv.com - **afreecatv**: afreecatv.com
@ -44,9 +45,8 @@
- **AmericasTestKitchen** - **AmericasTestKitchen**
- **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **AnimeOnDemand** - **AnimeOnDemand**
- **anitube.se**
- **Anvato** - **Anvato**
- **AnySex** - **aol.com**
- **APA** - **APA**
- **Aparat** - **Aparat**
- **AppleConnect** - **AppleConnect**
@ -58,16 +58,8 @@
- **ARD:mediathek** - **ARD:mediathek**
- **ARDBetaMediathek** - **ARDBetaMediathek**
- **Arkena** - **Arkena**
- **arte.tv**
- **arte.tv:+7** - **arte.tv:+7**
- **arte.tv:cinema**
- **arte.tv:concert**
- **arte.tv:creative**
- **arte.tv:ddc**
- **arte.tv:embed** - **arte.tv:embed**
- **arte.tv:future**
- **arte.tv:info**
- **arte.tv:magazine**
- **arte.tv:playlist** - **arte.tv:playlist**
- **AsianCrush** - **AsianCrush**
- **AsianCrushPlaylist** - **AsianCrushPlaylist**
@ -78,15 +70,12 @@
- **AudioBoom** - **AudioBoom**
- **audiomack** - **audiomack**
- **audiomack:album** - **audiomack:album**
- **auroravid**: AuroraVid
- **AWAAN** - **AWAAN**
- **awaan:live** - **awaan:live**
- **awaan:season** - **awaan:season**
- **awaan:video** - **awaan:video**
- **AZMedien**: AZ Medien videos - **AZMedien**: AZ Medien videos
- **BaiduVideo**: 百度视频 - **BaiduVideo**: 百度视频
- **bambuser**
- **bambuser:channel**
- **Bandcamp** - **Bandcamp**
- **Bandcamp:album** - **Bandcamp:album**
- **Bandcamp:weekly** - **Bandcamp:weekly**
@ -103,9 +92,12 @@
- **Bellator** - **Bellator**
- **BellMedia** - **BellMedia**
- **Bet** - **Bet**
- **bfi:player**
- **Bigflix** - **Bigflix**
- **Bild**: Bild.de - **Bild**: Bild.de
- **BiliBili** - **BiliBili**
- **BilibiliAudio**
- **BilibiliAudioAlbum**
- **BioBioChileTV** - **BioBioChileTV**
- **BIQLE** - **BIQLE**
- **BitChute** - **BitChute**
@ -149,6 +141,7 @@
- **CBSInteractive** - **CBSInteractive**
- **CBSLocal** - **CBSLocal**
- **cbsnews**: CBS News - **cbsnews**: CBS News
- **cbsnews:embed**
- **cbsnews:livevideo**: CBS News Live Videos - **cbsnews:livevideo**: CBS News Live Videos
- **CBSSports** - **CBSSports**
- **CCMA** - **CCMA**
@ -163,6 +156,7 @@
- **chirbit** - **chirbit**
- **chirbit:profile** - **chirbit:profile**
- **Cinchcast** - **Cinchcast**
- **Cinemax**
- **CiscoLiveSearch** - **CiscoLiveSearch**
- **CiscoLiveSession** - **CiscoLiveSession**
- **CJSW** - **CJSW**
@ -172,7 +166,6 @@
- **Clipsyndicate** - **Clipsyndicate**
- **CloserToTruth** - **CloserToTruth**
- **CloudflareStream** - **CloudflareStream**
- **cloudtime**: CloudTime
- **Cloudy** - **Cloudy**
- **Clubic** - **Clubic**
- **Clyp** - **Clyp**
@ -182,17 +175,16 @@
- **CNN** - **CNN**
- **CNNArticle** - **CNNArticle**
- **CNNBlogs** - **CNNBlogs**
- **ComCarCoff**
- **ComedyCentral** - **ComedyCentral**
- **ComedyCentralFullEpisodes** - **ComedyCentralFullEpisodes**
- **ComedyCentralShortname** - **ComedyCentralShortname**
- **ComedyCentralTV** - **ComedyCentralTV**
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED - **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
- **CONtv**
- **Corus** - **Corus**
- **Coub** - **Coub**
- **Cracked** - **Cracked**
- **Crackle** - **Crackle**
- **Criterion**
- **CrooksAndLiars** - **CrooksAndLiars**
- **crunchyroll** - **crunchyroll**
- **crunchyroll:playlist** - **crunchyroll:playlist**
@ -200,6 +192,7 @@
- **CSpan**: C-SPAN - **CSpan**: C-SPAN
- **CtsNews**: 華視新聞 - **CtsNews**: 華視新聞
- **CTVNews** - **CTVNews**
- **cu.ntv.co.jp**: Nippon Television Network
- **Culturebox** - **Culturebox**
- **CultureUnplugged** - **CultureUnplugged**
- **curiositystream** - **curiositystream**
@ -209,8 +202,6 @@
- **dailymotion** - **dailymotion**
- **dailymotion:playlist** - **dailymotion:playlist**
- **dailymotion:user** - **dailymotion:user**
- **DaisukiMotto**
- **DaisukiMottoPlaylist**
- **daum.net** - **daum.net**
- **daum.net:clip** - **daum.net:clip**
- **daum.net:playlist** - **daum.net:playlist**
@ -230,13 +221,12 @@
- **DiscoveryNetworksDe** - **DiscoveryNetworksDe**
- **DiscoveryVR** - **DiscoveryVR**
- **Disney** - **Disney**
- **dlive:stream**
- **dlive:vod**
- **Dotsub** - **Dotsub**
- **DouyuShow** - **DouyuShow**
- **DouyuTV**: 斗鱼 - **DouyuTV**: 斗鱼
- **DPlay** - **DPlay**
- **DPlayIt**
- **dramafever**
- **dramafever:series**
- **DRBonanza** - **DRBonanza**
- **Dropbox** - **Dropbox**
- **DrTuber** - **DrTuber**
@ -289,12 +279,12 @@
- **FiveThirtyEight** - **FiveThirtyEight**
- **FiveTV** - **FiveTV**
- **Flickr** - **Flickr**
- **Flipagram**
- **Folketinget**: Folketinget (ft.dk; Danish parliament) - **Folketinget**: Folketinget (ft.dk; Danish parliament)
- **FootyRoom** - **FootyRoom**
- **Formula1** - **Formula1**
- **FOX** - **FOX**
- **FOX9** - **FOX9**
- **FOX9News**
- **Foxgay** - **Foxgay**
- **foxnews**: Fox News and Fox Business Video - **foxnews**: Fox News and Fox Business Video
- **foxnews:article** - **foxnews:article**
@ -314,16 +304,12 @@
- **FrontendMastersCourse** - **FrontendMastersCourse**
- **FrontendMastersLesson** - **FrontendMastersLesson**
- **Funimation** - **Funimation**
- **FunkChannel** - **Funk**
- **FunkMix**
- **FunnyOrDie**
- **Fusion** - **Fusion**
- **Fux** - **Fux**
- **FXNetworks** - **FXNetworks**
- **Gaia** - **Gaia**
- **GameInformer** - **GameInformer**
- **GameOne**
- **gameone:playlist**
- **GameSpot** - **GameSpot**
- **GameStar** - **GameStar**
- **Gaskrank** - **Gaskrank**
@ -338,16 +324,13 @@
- **Globo** - **Globo**
- **GloboArticle** - **GloboArticle**
- **Go** - **Go**
- **Go90**
- **GodTube** - **GodTube**
- **Golem** - **Golem**
- **GoogleDrive** - **GoogleDrive**
- **Goshgay** - **Goshgay**
- **GPUTechConf** - **GPUTechConf**
- **Groupon** - **Groupon**
- **Hark**
- **hbo** - **hbo**
- **hbo:episode**
- **HearThisAt** - **HearThisAt**
- **Heise** - **Heise**
- **HellPorno** - **HellPorno**
@ -361,6 +344,7 @@
- **hitbox** - **hitbox**
- **hitbox:live** - **hitbox:live**
- **HitRecord** - **HitRecord**
- **hketv**: 香港教育局教育電視 (HKETV) Educational Television, Hong Kong Educational Bureau
- **HornBunny** - **HornBunny**
- **HotNewHipHop** - **HotNewHipHop**
- **hotstar** - **hotstar**
@ -374,7 +358,6 @@
- **Hungama** - **Hungama**
- **HungamaSong** - **HungamaSong**
- **Hypem** - **Hypem**
- **Iconosquare**
- **ign.com** - **ign.com**
- **imdb**: Internet Movie Database trailers - **imdb**: Internet Movie Database trailers
- **imdb:list**: Internet Movie Database lists - **imdb:list**: Internet Movie Database lists
@ -386,6 +369,7 @@
- **IndavideoEmbed** - **IndavideoEmbed**
- **InfoQ** - **InfoQ**
- **Instagram** - **Instagram**
- **instagram:tag**: Instagram hashtag search
- **instagram:user**: Instagram user profile - **instagram:user**: Instagram user profile
- **Internazionale** - **Internazionale**
- **InternetVideoArchive** - **InternetVideoArchive**
@ -413,14 +397,14 @@
- **Kankan** - **Kankan**
- **Karaoketv** - **Karaoketv**
- **KarriereVideos** - **KarriereVideos**
- **keek** - **Katsomo**
- **KeezMovies** - **KeezMovies**
- **Ketnet** - **Ketnet**
- **KhanAcademy** - **KhanAcademy**
- **KickStarter** - **KickStarter**
- **KinjaEmbed**
- **KinoPoisk** - **KinoPoisk**
- **KonserthusetPlay** - **KonserthusetPlay**
- **kontrtube**: KontrTube.ru - Труба зовёт
- **KrasView**: Красвью - **KrasView**: Красвью
- **Ku6** - **Ku6**
- **KUSI** - **KUSI**
@ -437,7 +421,6 @@
- **Lcp** - **Lcp**
- **LcpPlay** - **LcpPlay**
- **Le**: 乐视网 - **Le**: 乐视网
- **Learnr**
- **Lecture2Go** - **Lecture2Go**
- **Lecturio** - **Lecturio**
- **LecturioCourse** - **LecturioCourse**
@ -456,7 +439,9 @@
- **LineTV** - **LineTV**
- **linkedin:learning** - **linkedin:learning**
- **linkedin:learning:course** - **linkedin:learning:course**
- **LinuxAcademy**
- **LiTV** - **LiTV**
- **LiveJournal**
- **LiveLeak** - **LiveLeak**
- **LiveLeakEmbed** - **LiveLeakEmbed**
- **livestream** - **livestream**
@ -469,11 +454,10 @@
- **lynda**: lynda.com videos - **lynda**: lynda.com videos
- **lynda:course**: lynda.com online courses - **lynda:course**: lynda.com online courses
- **m6** - **m6**
- **macgamestore**: MacGameStore trailers
- **mailru**: Видео@Mail.Ru - **mailru**: Видео@Mail.Ru
- **mailru:music**: Музыка@Mail.Ru - **mailru:music**: Музыка@Mail.Ru
- **mailru:music:search**: Музыка@Mail.Ru - **mailru:music:search**: Музыка@Mail.Ru
- **MakerTV** - **MallTV**
- **mangomolo:live** - **mangomolo:live**
- **mangomolo:video** - **mangomolo:video**
- **ManyVids** - **ManyVids**
@ -483,9 +467,12 @@
- **MatchTV** - **MatchTV**
- **MDR**: MDR.DE and KiKA - **MDR**: MDR.DE and KiKA
- **media.ccc.de** - **media.ccc.de**
- **media.ccc.de:lists**
- **Medialaan** - **Medialaan**
- **Mediaset** - **Mediaset**
- **Mediasite** - **Mediasite**
- **MediasiteCatalog**
- **MediasiteNamedCatalog**
- **Medici** - **Medici**
- **megaphone.fm**: megaphone.fm embedded players - **megaphone.fm**: megaphone.fm embedded players
- **Meipai**: 美拍 - **Meipai**: 美拍
@ -496,14 +483,12 @@
- **Mgoon** - **Mgoon**
- **MGTV**: 芒果TV - **MGTV**: 芒果TV
- **MiaoPai** - **MiaoPai**
- **Minhateca**
- **MinistryGrid** - **MinistryGrid**
- **Minoto** - **Minoto**
- **miomio.tv** - **miomio.tv**
- **MiTele**: mitele.es - **MiTele**: mitele.es
- **mixcloud** - **mixcloud**
- **mixcloud:playlist** - **mixcloud:playlist**
- **mixcloud:stream**
- **mixcloud:user** - **mixcloud:user**
- **Mixer:live** - **Mixer:live**
- **Mixer:vod** - **Mixer:vod**
@ -525,11 +510,10 @@
- **mtg**: MTG services - **mtg**: MTG services
- **mtv** - **mtv**
- **mtv.de** - **mtv.de**
- **mtv81**
- **mtv:video** - **mtv:video**
- **mtvjapan**
- **mtvservices:embedded** - **mtvservices:embedded**
- **MuenchenTV**: münchen.tv - **MuenchenTV**: münchen.tv
- **MusicPlayOn**
- **mva**: Microsoft Virtual Academy videos - **mva**: Microsoft Virtual Academy videos
- **mva:course**: Microsoft Virtual Academy courses - **mva:course**: Microsoft Virtual Academy courses
- **Mwave** - **Mwave**
@ -544,6 +528,7 @@
- **MyVisionTV** - **MyVisionTV**
- **n-tv.de** - **n-tv.de**
- **natgeo:video** - **natgeo:video**
- **NationalGeographicTV**
- **Naver** - **Naver**
- **NBA** - **NBA**
- **NBC** - **NBC**
@ -575,7 +560,6 @@
- **NextTV**: 壹電視 - **NextTV**: 壹電視
- **Nexx** - **Nexx**
- **NexxEmbed** - **NexxEmbed**
- **nfb**: National Film Board of Canada
- **nfl.com** - **nfl.com**
- **NhkVod** - **NhkVod**
- **nhl.com** - **nhl.com**
@ -601,7 +585,6 @@
- **nowness** - **nowness**
- **nowness:playlist** - **nowness:playlist**
- **nowness:series** - **nowness:series**
- **nowvideo**: NowVideo
- **Noz** - **Noz**
- **npo**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **npo**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **npo.nl:live** - **npo.nl:live**
@ -617,6 +600,7 @@
- **NRKTVEpisodes** - **NRKTVEpisodes**
- **NRKTVSeason** - **NRKTVSeason**
- **NRKTVSeries** - **NRKTVSeries**
- **NRLTV**
- **ntv.ru** - **ntv.ru**
- **Nuvid** - **Nuvid**
- **NYTimes** - **NYTimes**
@ -626,7 +610,6 @@
- **OdaTV** - **OdaTV**
- **Odnoklassniki** - **Odnoklassniki**
- **OktoberfestTV** - **OktoberfestTV**
- **on.aol.com**
- **OnDemandKorea** - **OnDemandKorea**
- **onet.pl** - **onet.pl**
- **onet.tv** - **onet.tv**
@ -635,7 +618,6 @@
- **OnionStudios** - **OnionStudios**
- **Ooyala** - **Ooyala**
- **OoyalaExternal** - **OoyalaExternal**
- **Openload**
- **OraTV** - **OraTV**
- **orf:fm4**: radio FM4 - **orf:fm4**: radio FM4
- **orf:fm4:story**: fm4.orf.at stories - **orf:fm4:story**: fm4.orf.at stories
@ -667,6 +649,8 @@
- **Piksel** - **Piksel**
- **Pinkbike** - **Pinkbike**
- **Pladform** - **Pladform**
- **Platzi**
- **PlatziCourse**
- **play.fm** - **play.fm**
- **PlayPlusTV** - **PlayPlusTV**
- **PlaysTV** - **PlaysTV**
@ -683,18 +667,16 @@
- **PopcornTV** - **PopcornTV**
- **PornCom** - **PornCom**
- **PornerBros** - **PornerBros**
- **PornFlip**
- **PornHd** - **PornHd**
- **PornHub**: PornHub and Thumbzilla - **PornHub**: PornHub and Thumbzilla
- **PornHubPlaylist** - **PornHubPagedVideoList**
- **PornHubUserVideos** - **PornHubUser**
- **PornHubUserVideosUpload**
- **Pornotube** - **Pornotube**
- **PornoVoisines** - **PornoVoisines**
- **PornoXO** - **PornoXO**
- **PornTube** - **PornTube**
- **PressTV** - **PressTV**
- **PrimeShareTV**
- **PromptFile**
- **prosiebensat1**: ProSiebenSat.1 Digital - **prosiebensat1**: ProSiebenSat.1 Digital
- **puhutv** - **puhutv**
- **puhutv:serie** - **puhutv:serie**
@ -713,7 +695,7 @@
- **radio.de** - **radio.de**
- **radiobremen** - **radiobremen**
- **radiocanada** - **radiocanada**
- **RadioCanadaAudioVideo** - **radiocanada:audiovideo**
- **radiofrance** - **radiofrance**
- **RadioJavan** - **RadioJavan**
- **Rai** - **Rai**
@ -725,6 +707,7 @@
- **RBMARadio** - **RBMARadio**
- **RDS**: RDS.ca - **RDS**: RDS.ca
- **RedBullTV** - **RedBullTV**
- **RedBullTVRrnContent**
- **Reddit** - **Reddit**
- **RedditR** - **RedditR**
- **RedTube** - **RedTube**
@ -734,8 +717,6 @@
- **Restudy** - **Restudy**
- **Reuters** - **Reuters**
- **ReverbNation** - **ReverbNation**
- **revision**
- **revision3:embed**
- **RICE** - **RICE**
- **RMCDecouverte** - **RMCDecouverte**
- **RockstarGames** - **RockstarGames**
@ -758,9 +739,7 @@
- **rtve.es:television** - **rtve.es:television**
- **RTVNH** - **RTVNH**
- **RTVS** - **RTVS**
- **Rudo**
- **RUHD** - **RUHD**
- **RulePorn**
- **rutube**: Rutube videos - **rutube**: Rutube videos
- **rutube:channel**: Rutube channels - **rutube:channel**: Rutube channels
- **rutube:embed**: Rutube embedded videos - **rutube:embed**: Rutube embedded videos
@ -774,6 +753,7 @@
- **safari:api** - **safari:api**
- **safari:course**: safaribooksonline.com online courses - **safari:course**: safaribooksonline.com online courses
- **SAKTV** - **SAKTV**
- **SaltTV**
- **Sapo**: SAPO Vídeos - **Sapo**: SAPO Vídeos
- **savefrom.net** - **savefrom.net**
- **SBS**: sbs.com.au - **SBS**: sbs.com.au
@ -781,11 +761,13 @@
- **screen.yahoo:search**: Yahoo screen search - **screen.yahoo:search**: Yahoo screen search
- **Screencast** - **Screencast**
- **ScreencastOMatic** - **ScreencastOMatic**
- **ScrippsNetworks**
- **scrippsnetworks:watch** - **scrippsnetworks:watch**
- **SCTE**
- **SCTECourse**
- **Seeker** - **Seeker**
- **SenateISVP** - **SenateISVP**
- **SendtoNews** - **SendtoNews**
- **ServingSys**
- **Servus** - **Servus**
- **Sexu** - **Sexu**
- **SeznamZpravy** - **SeznamZpravy**
@ -796,6 +778,7 @@
- **ShowRoomLive** - **ShowRoomLive**
- **Sina** - **Sina**
- **SkylineWebcams** - **SkylineWebcams**
- **SkyNews**
- **skynewsarabia:article** - **skynewsarabia:article**
- **skynewsarabia:video** - **skynewsarabia:video**
- **SkySports** - **SkySports**
@ -815,6 +798,7 @@
- **soundcloud:set** - **soundcloud:set**
- **soundcloud:trackstation** - **soundcloud:trackstation**
- **soundcloud:user** - **soundcloud:user**
- **SoundcloudEmbed**
- **soundgasm** - **soundgasm**
- **soundgasm:profile** - **soundgasm:profile**
- **southpark.cc.com** - **southpark.cc.com**
@ -823,6 +807,7 @@
- **southpark.nl** - **southpark.nl**
- **southparkstudios.dk** - **southparkstudios.dk**
- **SpankBang** - **SpankBang**
- **SpankBangPlaylist**
- **Spankwire** - **Spankwire**
- **Spiegel** - **Spiegel**
- **Spiegel:Article**: Articles on spiegel.de - **Spiegel:Article**: Articles on spiegel.de
@ -840,12 +825,14 @@
- **Steam** - **Steam**
- **Stitcher** - **Stitcher**
- **Streamable** - **Streamable**
- **Streamango**
- **streamcloud.eu** - **streamcloud.eu**
- **StreamCZ** - **StreamCZ**
- **StreetVoice** - **StreetVoice**
- **StretchInternet** - **StretchInternet**
- **stv:player**
- **SunPorno** - **SunPorno**
- **sverigesradio:episode**
- **sverigesradio:publication**
- **SVT** - **SVT**
- **SVTPage** - **SVTPage**
- **SVTPlay**: SVT Play and Öppet arkiv - **SVTPlay**: SVT Play and Öppet arkiv
@ -866,6 +853,7 @@
- **teachertube:user:collection**: teachertube.com user and collection videos - **teachertube:user:collection**: teachertube.com user and collection videos
- **TeachingChannel** - **TeachingChannel**
- **Teamcoco** - **Teamcoco**
- **TeamTreeHouse**
- **TechTalks** - **TechTalks**
- **techtv.mit.edu** - **techtv.mit.edu**
- **ted** - **ted**
@ -878,13 +866,14 @@
- **TeleQuebec** - **TeleQuebec**
- **TeleQuebecEmission** - **TeleQuebecEmission**
- **TeleQuebecLive** - **TeleQuebecLive**
- **TeleQuebecSquat**
- **TeleTask** - **TeleTask**
- **Telewebion** - **Telewebion**
- **TennisTV** - **TennisTV**
- **TenPlay**
- **TF1** - **TF1**
- **TFO** - **TFO**
- **TheIntercept** - **TheIntercept**
- **theoperaplatform**
- **ThePlatform** - **ThePlatform**
- **ThePlatformFeed** - **ThePlatformFeed**
- **TheScene** - **TheScene**
@ -909,6 +898,7 @@
- **ToypicsUser**: Toypics user profile - **ToypicsUser**: Toypics user profile
- **TrailerAddict** (Currently broken) - **TrailerAddict** (Currently broken)
- **Trilulilu** - **Trilulilu**
- **TruNews**
- **TruTV** - **TruTV**
- **Tube8** - **Tube8**
- **TubiTv** - **TubiTv**
@ -919,11 +909,12 @@
- **tunein:topic** - **tunein:topic**
- **TunePk** - **TunePk**
- **Turbo** - **Turbo**
- **Tutv**
- **tv.dfb.de** - **tv.dfb.de**
- **TV2** - **TV2**
- **tv2.hu** - **tv2.hu**
- **TV2Article** - **TV2Article**
- **TV2DK**
- **TV2DKBornholmPlay**
- **TV4**: tv4.se and tv4play.se - **TV4**: tv4.se and tv4play.se
- **TV5MondePlus**: TV5MONDE+ - **TV5MondePlus**: TV5MONDE+
- **TVA** - **TVA**
@ -960,10 +951,12 @@
- **twitch:vod** - **twitch:vod**
- **twitter** - **twitter**
- **twitter:amplify** - **twitter:amplify**
- **twitter:broadcast**
- **twitter:card** - **twitter:card**
- **udemy** - **udemy**
- **udemy:course** - **udemy:course**
- **UDNEmbed**: 聯合影音 - **UDNEmbed**: 聯合影音
- **UFCArabia**
- **UFCTV** - **UFCTV**
- **UKTVPlay** - **UKTVPlay**
- **umg:de**: Universal Music Deutschland - **umg:de**: Universal Music Deutschland
@ -984,7 +977,6 @@
- **Vbox7** - **Vbox7**
- **VeeHD** - **VeeHD**
- **Veoh** - **Veoh**
- **Vessel**
- **Vesti**: Вести.Ru - **Vesti**: Вести.Ru
- **Vevo** - **Vevo**
- **VevoPlaylist** - **VevoPlaylist**
@ -999,16 +991,12 @@
- **Viddler** - **Viddler**
- **Videa** - **Videa**
- **video.google:search**: Google Video search - **video.google:search**: Google Video search
- **video.mit.edu**
- **VideoDetective** - **VideoDetective**
- **videofy.me** - **videofy.me**
- **VideoMega**
- **videomore** - **videomore**
- **videomore:season** - **videomore:season**
- **videomore:video** - **videomore:video**
- **VideoPremium**
- **VideoPress** - **VideoPress**
- **videoweed**: VideoWeed
- **Vidio** - **Vidio**
- **VidLii** - **VidLii**
- **vidme** - **vidme**
@ -1019,7 +1007,6 @@
- **vier:videos** - **vier:videos**
- **ViewLift** - **ViewLift**
- **ViewLiftEmbed** - **ViewLiftEmbed**
- **Viewster**
- **Viidea** - **Viidea**
- **viki** - **viki**
- **viki:channel** - **viki:channel**
@ -1053,10 +1040,9 @@
- **Voot** - **Voot**
- **VoxMedia** - **VoxMedia**
- **VoxMediaVolume** - **VoxMediaVolume**
- **Vporn**
- **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **Vrak** - **Vrak**
- **VRT**: deredactie.be, sporza.be, cobra.be and cobra.canvas.be - **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza
- **VrtNU**: VrtNU.be - **VrtNU**: VrtNU.be
- **vrv** - **vrv**
- **vrv:series** - **vrv:series**
@ -1067,6 +1053,7 @@
- **VVVVID** - **VVVVID**
- **VyboryMos** - **VyboryMos**
- **Vzaar** - **Vzaar**
- **Wakanim**
- **Walla** - **Walla**
- **WalyTV** - **WalyTV**
- **washingtonpost** - **washingtonpost**
@ -1085,21 +1072,18 @@
- **Weibo** - **Weibo**
- **WeiboMobile** - **WeiboMobile**
- **WeiqiTV**: WQTV - **WeiqiTV**: WQTV
- **wholecloud**: WholeCloud
- **Wimp**
- **Wistia** - **Wistia**
- **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **WorldStarHipHop** - **WorldStarHipHop**
- **wrzuta.pl**
- **wrzuta.pl:playlist**
- **WSJ**: Wall Street Journal - **WSJ**: Wall Street Journal
- **WSJArticle** - **WSJArticle**
- **WWE** - **WWE**
- **XBef** - **XBef**
- **XboxClips** - **XboxClips**
- **XFileShare**: XFileShare based sites: DaClips, FileHoot, GorillaVid, MovPod, PowerWatch, Rapidvideo.ws, TheVideoBee, Vidto, Streamin.To, XVIDSTAGE, Vid ABC, VidBom, vidlo, RapidVideo.TV, FastVideo.me - **XFileShare**: XFileShare based sites: ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, XVideoSharing
- **XHamster** - **XHamster**
- **XHamsterEmbed** - **XHamsterEmbed**
- **XHamsterUser**
- **xiami:album**: 虾米音乐 - 专辑 - **xiami:album**: 虾米音乐 - 专辑
- **xiami:artist**: 虾米音乐 - 歌手 - **xiami:artist**: 虾米音乐 - 歌手
- **xiami:collection**: 虾米音乐 - 精选集 - **xiami:collection**: 虾米音乐 - 精选集
@ -1115,10 +1099,14 @@
- **XVideos** - **XVideos**
- **XXXYMovies** - **XXXYMovies**
- **Yahoo**: Yahoo screen and movies - **Yahoo**: Yahoo screen and movies
- **yahoo:gyao**
- **yahoo:gyao:player**
- **yahoo:japannews**: Yahoo! Japan News
- **YandexDisk** - **YandexDisk**
- **yandexmusic:album**: Яндекс.Музыка - Альбом - **yandexmusic:album**: Яндекс.Музыка - Альбом
- **yandexmusic:playlist**: Яндекс.Музыка - Плейлист - **yandexmusic:playlist**: Яндекс.Музыка - Плейлист
- **yandexmusic:track**: Яндекс.Музыка - Трек - **yandexmusic:track**: Яндекс.Музыка - Трек
- **YandexVideo**
- **YapFiles** - **YapFiles**
- **YesJapan** - **YesJapan**
- **yinyuetai:video**: 音悦Tai - **yinyuetai:video**: 音悦Tai

@ -3,4 +3,4 @@ universal = True
[flake8] [flake8]
exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git,venv exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git,venv
ignore = E402,E501,E731,E741 ignore = E402,E501,E731,E741,W503

@ -104,7 +104,7 @@ setup(
version=__version__, version=__version__,
description=DESCRIPTION, description=DESCRIPTION,
long_description=LONG_DESCRIPTION, long_description=LONG_DESCRIPTION,
url='https://github.com/rg3/youtube-dl', url='https://github.com/ytdl-org/youtube-dl',
author='Ricardo Garcia', author='Ricardo Garcia',
author_email='ytdl@yt-dl.org', author_email='ytdl@yt-dl.org',
maintainer='Sergey M.', maintainer='Sergey M.',

@ -61,6 +61,7 @@ class TestInfoExtractor(unittest.TestCase):
<meta content='Foo' property=og:foobar> <meta content='Foo' property=og:foobar>
<meta name="og:test1" content='foo > < bar'/> <meta name="og:test1" content='foo > < bar'/>
<meta name="og:test2" content="foo >//< bar"/> <meta name="og:test2" content="foo >//< bar"/>
<meta property=og-test3 content='Ill-formatted opengraph'/>
''' '''
self.assertEqual(ie._og_search_title(html), 'Foo') self.assertEqual(ie._og_search_title(html), 'Foo')
self.assertEqual(ie._og_search_description(html), 'Some video\'s description ') self.assertEqual(ie._og_search_description(html), 'Some video\'s description ')
@ -69,6 +70,7 @@ class TestInfoExtractor(unittest.TestCase):
self.assertEqual(ie._og_search_property('foobar', html), 'Foo') self.assertEqual(ie._og_search_property('foobar', html), 'Foo')
self.assertEqual(ie._og_search_property('test1', html), 'foo > < bar') self.assertEqual(ie._og_search_property('test1', html), 'foo > < bar')
self.assertEqual(ie._og_search_property('test2', html), 'foo >//< bar') self.assertEqual(ie._og_search_property('test2', html), 'foo >//< bar')
self.assertEqual(ie._og_search_property('test3', html), 'Ill-formatted opengraph')
self.assertEqual(ie._og_search_property(('test0', 'test1'), html), 'foo > < bar') self.assertEqual(ie._og_search_property(('test0', 'test1'), html), 'foo > < bar')
self.assertRaises(RegexNotFoundError, ie._og_search_property, 'test0', html, None, fatal=True) self.assertRaises(RegexNotFoundError, ie._og_search_property, 'test0', html, None, fatal=True)
self.assertRaises(RegexNotFoundError, ie._og_search_property, ('test0', 'test00'), html, None, fatal=True) self.assertRaises(RegexNotFoundError, ie._og_search_property, ('test0', 'test00'), html, None, fatal=True)
@ -105,6 +107,184 @@ class TestInfoExtractor(unittest.TestCase):
self.assertRaises(ExtractorError, self.ie._download_json, uri, None) self.assertRaises(ExtractorError, self.ie._download_json, uri, None)
self.assertEqual(self.ie._download_json(uri, None, fatal=False), None) self.assertEqual(self.ie._download_json(uri, None, fatal=False), None)
def test_parse_html5_media_entries(self):
# from https://www.r18.com/
# with kbps in label
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.r18.com/',
r'''
<video id="samplevideo_amateur" class="js-samplevideo video-js vjs-default-skin vjs-big-play-centered" controls preload="auto" width="400" height="225" poster="//pics.r18.com/digital/amateur/mgmr105/mgmr105jp.jpg">
<source id="video_source" src="https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_sm_w.mp4" type="video/mp4" res="240" label="300kbps">
<source id="video_source" src="https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_dm_w.mp4" type="video/mp4" res="480" label="1000kbps">
<source id="video_source" src="https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_dmb_w.mp4" type="video/mp4" res="740" label="1500kbps">
<p>Your browser does not support the video tag.</p>
</video>
''', None)[0],
{
'formats': [{
'url': 'https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_sm_w.mp4',
'ext': 'mp4',
'format_id': '300kbps',
'height': 240,
'tbr': 300,
}, {
'url': 'https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_dm_w.mp4',
'ext': 'mp4',
'format_id': '1000kbps',
'height': 480,
'tbr': 1000,
}, {
'url': 'https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_dmb_w.mp4',
'ext': 'mp4',
'format_id': '1500kbps',
'height': 740,
'tbr': 1500,
}],
'thumbnail': '//pics.r18.com/digital/amateur/mgmr105/mgmr105jp.jpg'
})
# from https://www.csfd.cz/
# with width and height
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.csfd.cz/',
r'''
<video width="770" height="328" preload="none" controls poster="https://img.csfd.cz/files/images/film/video/preview/163/344/163344118_748d20.png?h360" >
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327358_eac647.mp4" type="video/mp4" width="640" height="360">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327360_3d2646.mp4" type="video/mp4" width="1280" height="720">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327356_91f258.mp4" type="video/mp4" width="1920" height="1080">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327359_962b4a.webm" type="video/webm" width="640" height="360">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327361_6feee0.webm" type="video/webm" width="1280" height="720">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327357_8ab472.webm" type="video/webm" width="1920" height="1080">
<track src="https://video.csfd.cz/files/subtitles/163/344/163344115_4c388b.srt" type="text/x-srt" kind="subtitles" srclang="cs" label="cs">
</video>
''', None)[0],
{
'formats': [{
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327358_eac647.mp4',
'ext': 'mp4',
'width': 640,
'height': 360,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327360_3d2646.mp4',
'ext': 'mp4',
'width': 1280,
'height': 720,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327356_91f258.mp4',
'ext': 'mp4',
'width': 1920,
'height': 1080,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327359_962b4a.webm',
'ext': 'webm',
'width': 640,
'height': 360,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327361_6feee0.webm',
'ext': 'webm',
'width': 1280,
'height': 720,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327357_8ab472.webm',
'ext': 'webm',
'width': 1920,
'height': 1080,
}],
'subtitles': {
'cs': [{'url': 'https://video.csfd.cz/files/subtitles/163/344/163344115_4c388b.srt'}]
},
'thumbnail': 'https://img.csfd.cz/files/images/film/video/preview/163/344/163344118_748d20.png?h360'
})
# from https://tamasha.com/v/Kkdjw
# with height in label
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://tamasha.com/v/Kkdjw',
r'''
<video crossorigin="anonymous">
<source src="https://s-v2.tamasha.com/statics/videos_file/19/8f/Kkdjw_198feff8577d0057536e905cce1fb61438dd64e0_n_240.mp4" type="video/mp4" label="AUTO" res="0"/>
<source src="https://s-v2.tamasha.com/statics/videos_file/19/8f/Kkdjw_198feff8577d0057536e905cce1fb61438dd64e0_n_240.mp4" type="video/mp4"
label="240p" res="240"/>
<source src="https://s-v2.tamasha.com/statics/videos_file/20/00/Kkdjw_200041c66f657fc967db464d156eafbc1ed9fe6f_n_144.mp4" type="video/mp4"
label="144p" res="144"/>
</video>
''', None)[0],
{
'formats': [{
'url': 'https://s-v2.tamasha.com/statics/videos_file/19/8f/Kkdjw_198feff8577d0057536e905cce1fb61438dd64e0_n_240.mp4',
}, {
'url': 'https://s-v2.tamasha.com/statics/videos_file/19/8f/Kkdjw_198feff8577d0057536e905cce1fb61438dd64e0_n_240.mp4',
'ext': 'mp4',
'format_id': '240p',
'height': 240,
}, {
'url': 'https://s-v2.tamasha.com/statics/videos_file/20/00/Kkdjw_200041c66f657fc967db464d156eafbc1ed9fe6f_n_144.mp4',
'ext': 'mp4',
'format_id': '144p',
'height': 144,
}]
})
# from https://www.directvnow.com
# with data-src
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.directvnow.com',
r'''
<video id="vid1" class="header--video-masked active" muted playsinline>
<source data-src="https://cdn.directv.com/content/dam/dtv/prod/website_directvnow-international/videos/DTVN_hdr_HBO_v3.mp4" type="video/mp4" />
</video>
''', None)[0],
{
'formats': [{
'ext': 'mp4',
'url': 'https://cdn.directv.com/content/dam/dtv/prod/website_directvnow-international/videos/DTVN_hdr_HBO_v3.mp4',
}]
})
# from https://www.directvnow.com
# with data-src
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.directvnow.com',
r'''
<video id="vid1" class="header--video-masked active" muted playsinline>
<source data-src="https://cdn.directv.com/content/dam/dtv/prod/website_directvnow-international/videos/DTVN_hdr_HBO_v3.mp4" type="video/mp4" />
</video>
''', None)[0],
{
'formats': [{
'url': 'https://cdn.directv.com/content/dam/dtv/prod/website_directvnow-international/videos/DTVN_hdr_HBO_v3.mp4',
'ext': 'mp4',
}]
})
# from https://www.klarna.com/uk/
# with data-video-src
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.klarna.com/uk/',
r'''
<video loop autoplay muted class="responsive-video block-kl__video video-on-medium">
<source src="" data-video-desktop data-video-src="https://www.klarna.com/uk/wp-content/uploads/sites/11/2019/01/KL062_Smooth3_0_DogWalking_5s_920x080_.mp4" type="video/mp4" />
</video>
''', None)[0],
{
'formats': [{
'url': 'https://www.klarna.com/uk/wp-content/uploads/sites/11/2019/01/KL062_Smooth3_0_DogWalking_5s_920x080_.mp4',
'ext': 'mp4',
}],
})
def test_extract_jwplayer_data_realworld(self): def test_extract_jwplayer_data_realworld(self):
# from http://www.suffolk.edu/sjc/ # from http://www.suffolk.edu/sjc/
expect_dict( expect_dict(
@ -199,7 +379,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
def test_parse_m3u8_formats(self): def test_parse_m3u8_formats(self):
_TEST_CASES = [ _TEST_CASES = [
( (
# https://github.com/rg3/youtube-dl/issues/11507 # https://github.com/ytdl-org/youtube-dl/issues/11507
# http://pluzz.francetv.fr/videos/le_ministere.html # http://pluzz.francetv.fr/videos/le_ministere.html
'pluzz_francetv_11507', 'pluzz_francetv_11507',
'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais', 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
@ -261,7 +441,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
}] }]
), ),
( (
# https://github.com/rg3/youtube-dl/issues/11995 # https://github.com/ytdl-org/youtube-dl/issues/11995
# http://teamcoco.com/video/clueless-gamer-super-bowl-for-honor # http://teamcoco.com/video/clueless-gamer-super-bowl-for-honor
'teamcoco_11995', 'teamcoco_11995',
'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8', 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
@ -335,7 +515,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
}] }]
), ),
( (
# https://github.com/rg3/youtube-dl/issues/12211 # https://github.com/ytdl-org/youtube-dl/issues/12211
# http://video.toggle.sg/en/series/whoopie-s-world/ep3/478601 # http://video.toggle.sg/en/series/whoopie-s-world/ep3/478601
'toggle_mobile_12211', 'toggle_mobile_12211',
'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8', 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
@ -497,7 +677,64 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'width': 1280, 'width': 1280,
'height': 720, 'height': 720,
}] }]
) ),
(
# https://github.com/ytdl-org/youtube-dl/issues/18923
# https://www.ted.com/talks/boris_hesser_a_grassroots_healthcare_revolution_in_africa
'ted_18923',
'http://hls.ted.com/talks/31241.m3u8',
[{
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/audio/600k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '600k-Audio',
'vcodec': 'none',
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/audio/600k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '68',
'vcodec': 'none',
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/64k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '163',
'acodec': 'none',
'width': 320,
'height': 180,
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/180k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '481',
'acodec': 'none',
'width': 512,
'height': 288,
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/320k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '769',
'acodec': 'none',
'width': 512,
'height': 288,
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/450k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '984',
'acodec': 'none',
'width': 512,
'height': 288,
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/600k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '1255',
'acodec': 'none',
'width': 640,
'height': 360,
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/950k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '1693',
'acodec': 'none',
'width': 853,
'height': 480,
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/1500k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '2462',
'acodec': 'none',
'width': 1280,
'height': 720,
}]
),
] ]
for m3u8_file, m3u8_url, expected_formats in _TEST_CASES: for m3u8_file, m3u8_url, expected_formats in _TEST_CASES:
@ -511,11 +748,12 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
def test_parse_mpd_formats(self): def test_parse_mpd_formats(self):
_TEST_CASES = [ _TEST_CASES = [
( (
# https://github.com/rg3/youtube-dl/issues/13919 # https://github.com/ytdl-org/youtube-dl/issues/13919
# Also tests duplicate representation ids, see # Also tests duplicate representation ids, see
# https://github.com/rg3/youtube-dl/issues/15111 # https://github.com/ytdl-org/youtube-dl/issues/15111
'float_duration', 'float_duration',
'http://unknown/manifest.mpd', 'http://unknown/manifest.mpd', # mpd_url
None, # mpd_base_url
[{ [{
'manifest_url': 'http://unknown/manifest.mpd', 'manifest_url': 'http://unknown/manifest.mpd',
'ext': 'm4a', 'ext': 'm4a',
@ -593,9 +831,10 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 1080, 'height': 1080,
}] }]
), ( ), (
# https://github.com/rg3/youtube-dl/pull/14844 # https://github.com/ytdl-org/youtube-dl/pull/14844
'urls_only', 'urls_only',
'http://unknown/manifest.mpd', 'http://unknown/manifest.mpd', # mpd_url
None, # mpd_base_url
[{ [{
'manifest_url': 'http://unknown/manifest.mpd', 'manifest_url': 'http://unknown/manifest.mpd',
'ext': 'mp4', 'ext': 'mp4',
@ -674,22 +913,68 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'width': 1920, 'width': 1920,
'height': 1080, 'height': 1080,
}] }]
), (
# https://github.com/ytdl-org/youtube-dl/issues/20346
# Media considered unfragmented even though it contains
# Initialization tag
'unfragmented',
'https://v.redd.it/hw1x7rcg7zl21/DASHPlaylist.mpd', # mpd_url
'https://v.redd.it/hw1x7rcg7zl21', # mpd_base_url
[{
'url': 'https://v.redd.it/hw1x7rcg7zl21/audio',
'manifest_url': 'https://v.redd.it/hw1x7rcg7zl21/DASHPlaylist.mpd',
'ext': 'm4a',
'format_id': 'AUDIO-1',
'format_note': 'DASH audio',
'container': 'm4a_dash',
'acodec': 'mp4a.40.2',
'vcodec': 'none',
'tbr': 129.87,
'asr': 48000,
}, {
'url': 'https://v.redd.it/hw1x7rcg7zl21/DASH_240',
'manifest_url': 'https://v.redd.it/hw1x7rcg7zl21/DASHPlaylist.mpd',
'ext': 'mp4',
'format_id': 'VIDEO-2',
'format_note': 'DASH video',
'container': 'mp4_dash',
'acodec': 'none',
'vcodec': 'avc1.4d401e',
'tbr': 608.0,
'width': 240,
'height': 240,
'fps': 30,
}, {
'url': 'https://v.redd.it/hw1x7rcg7zl21/DASH_360',
'manifest_url': 'https://v.redd.it/hw1x7rcg7zl21/DASHPlaylist.mpd',
'ext': 'mp4',
'format_id': 'VIDEO-1',
'format_note': 'DASH video',
'container': 'mp4_dash',
'acodec': 'none',
'vcodec': 'avc1.4d401e',
'tbr': 804.261,
'width': 360,
'height': 360,
'fps': 30,
}]
) )
] ]
-for mpd_file, mpd_url, expected_formats in _TEST_CASES:
+for mpd_file, mpd_url, mpd_base_url, expected_formats in _TEST_CASES:
     with io.open('./test/testdata/mpd/%s.mpd' % mpd_file,
                  mode='r', encoding='utf-8') as f:
         formats = self.ie._parse_mpd_formats(
             compat_etree_fromstring(f.read().encode('utf-8')),
-            mpd_url=mpd_url)
+            mpd_base_url=mpd_base_url, mpd_url=mpd_url)
         self.ie._sort_formats(formats)
         expect_value(self, formats, expected_formats, None)
def test_parse_f4m_formats(self): def test_parse_f4m_formats(self):
_TEST_CASES = [ _TEST_CASES = [
( (
# https://github.com/rg3/youtube-dl/issues/14660 # https://github.com/ytdl-org/youtube-dl/issues/14660
'custom_base_url', 'custom_base_url',
'http://api.new.livestream.com/accounts/6115179/events/6764928/videos/144884262.f4m', 'http://api.new.livestream.com/accounts/6115179/events/6764928/videos/144884262.f4m',
[{ [{

@ -239,6 +239,76 @@ class TestFormatSelection(unittest.TestCase):
downloaded = ydl.downloaded_info_dicts[0] downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'vid-vcodec-dot') self.assertEqual(downloaded['format_id'], 'vid-vcodec-dot')
def test_format_selection_string_ops(self):
formats = [
{'format_id': 'abc-cba', 'ext': 'mp4', 'url': TEST_URL},
{'format_id': 'zxc-cxz', 'ext': 'webm', 'url': TEST_URL},
]
info_dict = _make_result(formats)
# equals (=)
ydl = YDL({'format': '[format_id=abc-cba]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'abc-cba')
# does not equal (!=)
ydl = YDL({'format': '[format_id!=abc-cba]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'zxc-cxz')
ydl = YDL({'format': '[format_id!=abc-cba][format_id!=zxc-cxz]'})
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
# starts with (^=)
ydl = YDL({'format': '[format_id^=abc]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'abc-cba')
# does not start with (!^=)
ydl = YDL({'format': '[format_id!^=abc]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'zxc-cxz')
ydl = YDL({'format': '[format_id!^=abc][format_id!^=zxc]'})
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
# ends with ($=)
ydl = YDL({'format': '[format_id$=cba]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'abc-cba')
# does not end with (!$=)
ydl = YDL({'format': '[format_id!$=cba]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'zxc-cxz')
ydl = YDL({'format': '[format_id!$=cba][format_id!$=cxz]'})
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
# contains (*=)
ydl = YDL({'format': '[format_id*=bc-cb]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'abc-cba')
# does not contain (!*=)
ydl = YDL({'format': '[format_id!*=bc-cb]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'zxc-cxz')
ydl = YDL({'format': '[format_id!*=abc][format_id!*=zxc]'})
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
ydl = YDL({'format': '[format_id!*=-]'})
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
def test_youtube_format_selection(self): def test_youtube_format_selection(self):
order = [ order = [
'38', '37', '46', '22', '45', '35', '44', '18', '34', '43', '6', '5', '17', '36', '13', '38', '37', '46', '22', '45', '35', '44', '18', '34', '43', '6', '5', '17', '36', '13',
@ -341,7 +411,7 @@ class TestFormatSelection(unittest.TestCase):
# For extractors with incomplete formats (all formats are audio-only or # For extractors with incomplete formats (all formats are audio-only or
# video-only) best and worst should fallback to corresponding best/worst # video-only) best and worst should fallback to corresponding best/worst
# video-only or audio-only formats (as per # video-only or audio-only formats (as per
# https://github.com/rg3/youtube-dl/pull/5556) # https://github.com/ytdl-org/youtube-dl/pull/5556)
formats = [ formats = [
{'format_id': 'low', 'ext': 'mp3', 'preference': 1, 'vcodec': 'none', 'url': TEST_URL}, {'format_id': 'low', 'ext': 'mp3', 'preference': 1, 'vcodec': 'none', 'url': TEST_URL},
{'format_id': 'high', 'ext': 'mp3', 'preference': 2, 'vcodec': 'none', 'url': TEST_URL}, {'format_id': 'high', 'ext': 'mp3', 'preference': 2, 'vcodec': 'none', 'url': TEST_URL},
@ -372,7 +442,7 @@ class TestFormatSelection(unittest.TestCase):
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy()) self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
def test_format_selection_issue_10083(self): def test_format_selection_issue_10083(self):
# See https://github.com/rg3/youtube-dl/issues/10083 # See https://github.com/ytdl-org/youtube-dl/issues/10083
formats = [ formats = [
{'format_id': 'regular', 'height': 360, 'url': TEST_URL}, {'format_id': 'regular', 'height': 360, 'url': TEST_URL},
{'format_id': 'video', 'height': 720, 'acodec': 'none', 'url': TEST_URL}, {'format_id': 'video', 'height': 720, 'acodec': 'none', 'url': TEST_URL},
@ -783,7 +853,7 @@ class TestYoutubeDL(unittest.TestCase):
self.assertEqual(result, [2, 3, 4]) self.assertEqual(result, [2, 3, 4])
def test_urlopen_no_file_protocol(self): def test_urlopen_no_file_protocol(self):
# see https://github.com/rg3/youtube-dl/issues/8227 # see https://github.com/ytdl-org/youtube-dl/issues/8227
ydl = YDL() ydl = YDL()
self.assertRaises(compat_urllib_error.URLError, ydl.urlopen, 'file:///etc/passwd') self.assertRaises(compat_urllib_error.URLError, ydl.urlopen, 'file:///etc/passwd')

@ -29,6 +29,16 @@ class TestYoutubeDLCookieJar(unittest.TestCase):
tf.close() tf.close()
os.remove(tf.name) os.remove(tf.name)
def test_strip_httponly_prefix(self):
cookiejar = YoutubeDLCookieJar('./test/testdata/cookies/httponly_cookies.txt')
cookiejar.load(ignore_discard=True, ignore_expires=True)
def assert_cookie_has_value(key):
self.assertEqual(cookiejar._cookies['www.foobar.foobar']['/'][key].value, key + '_VALUE')
assert_cookie_has_value('HTTPONLY_COOKIE')
assert_cookie_has_value('JS_ACCESSIBLE_COOKIE')
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()

@ -44,16 +44,16 @@ class TestAES(unittest.TestCase):
 def test_decrypt_text(self):
     password = intlist_to_bytes(self.key).decode('utf-8')
     encrypted = base64.b64encode(
-        intlist_to_bytes(self.iv[:8]) +
-        b'\x17\x15\x93\xab\x8d\x80V\xcdV\xe0\t\xcdo\xc2\xa5\xd8ksM\r\xe27N\xae'
+        intlist_to_bytes(self.iv[:8])
+        + b'\x17\x15\x93\xab\x8d\x80V\xcdV\xe0\t\xcdo\xc2\xa5\xd8ksM\r\xe27N\xae'
     ).decode('utf-8')
     decrypted = (aes_decrypt_text(encrypted, password, 16))
     self.assertEqual(decrypted, self.secret_msg)
     password = intlist_to_bytes(self.key).decode('utf-8')
     encrypted = base64.b64encode(
-        intlist_to_bytes(self.iv[:8]) +
-        b'\x0b\xe6\xa4\xd9z\x0e\xb8\xb9\xd0\xd4i_\x85\x1d\x99\x98_\xe5\x80\xe7.\xbf\xa5\x83'
+        intlist_to_bytes(self.iv[:8])
+        + b'\x0b\xe6\xa4\xd9z\x0e\xb8\xb9\xd0\xd4i_\x85\x1d\x99\x98_\xe5\x80\xe7.\xbf\xa5\x83'
     ).decode('utf-8')
     decrypted = (aes_decrypt_text(encrypted, password, 32))
     self.assertEqual(decrypted, self.secret_msg)

@ -110,7 +110,7 @@ class TestAllURLsMatching(unittest.TestCase):
self.assertMatch('https://vimeo.com/user7108434/videos', ['vimeo:user']) self.assertMatch('https://vimeo.com/user7108434/videos', ['vimeo:user'])
self.assertMatch('https://vimeo.com/user21297594/review/75524534/3c257a1b5d', ['vimeo:review']) self.assertMatch('https://vimeo.com/user21297594/review/75524534/3c257a1b5d', ['vimeo:review'])
# https://github.com/rg3/youtube-dl/issues/1930 # https://github.com/ytdl-org/youtube-dl/issues/1930
def test_soundcloud_not_matching_sets(self): def test_soundcloud_not_matching_sets(self):
self.assertMatch('http://soundcloud.com/floex/sets/gone-ep', ['soundcloud:set']) self.assertMatch('http://soundcloud.com/floex/sets/gone-ep', ['soundcloud:set'])
@ -119,16 +119,10 @@ class TestAllURLsMatching(unittest.TestCase):
self.assertMatch('http://tatianamaslanydaily.tumblr.com/post/54196191430', ['Tumblr']) self.assertMatch('http://tatianamaslanydaily.tumblr.com/post/54196191430', ['Tumblr'])
def test_pbs(self): def test_pbs(self):
# https://github.com/rg3/youtube-dl/issues/2350 # https://github.com/ytdl-org/youtube-dl/issues/2350
self.assertMatch('http://video.pbs.org/viralplayer/2365173446/', ['pbs']) self.assertMatch('http://video.pbs.org/viralplayer/2365173446/', ['pbs'])
self.assertMatch('http://video.pbs.org/widget/partnerplayer/980042464/', ['pbs']) self.assertMatch('http://video.pbs.org/widget/partnerplayer/980042464/', ['pbs'])
-def test_yahoo_https(self):
-    # https://github.com/rg3/youtube-dl/issues/2701
-    self.assertMatch(
-        'https://screen.yahoo.com/smartwatches-latest-wearable-gadgets-163745379-cbs.html',
-        ['Yahoo'])
def test_no_duplicated_ie_names(self): def test_no_duplicated_ie_names(self):
name_accu = collections.defaultdict(list) name_accu = collections.defaultdict(list)
for ie in self.ies: for ie in self.ies:

@ -13,6 +13,7 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.compat import ( from youtube_dl.compat import (
compat_getenv, compat_getenv,
compat_setenv, compat_setenv,
compat_etree_Element,
compat_etree_fromstring, compat_etree_fromstring,
compat_expanduser, compat_expanduser,
compat_shlex_split, compat_shlex_split,
@ -90,6 +91,12 @@ class TestCompat(unittest.TestCase):
self.assertEqual(compat_shlex_split('-option "one\ntwo" \n -flag'), ['-option', 'one\ntwo', '-flag']) self.assertEqual(compat_shlex_split('-option "one\ntwo" \n -flag'), ['-option', 'one\ntwo', '-flag'])
self.assertEqual(compat_shlex_split('-val 中文'), ['-val', '中文']) self.assertEqual(compat_shlex_split('-val 中文'), ['-val', '中文'])
def test_compat_etree_Element(self):
try:
compat_etree_Element.items
except AttributeError:
self.fail('compat_etree_Element is not a type')
def test_compat_etree_fromstring(self): def test_compat_etree_fromstring(self):
xml = ''' xml = '''
<root foo="bar" spam="中文"> <root foo="bar" spam="中文">

@ -34,8 +34,8 @@ def _make_testfunc(testfile):
def test_func(self): def test_func(self):
as_file = os.path.join(TEST_DIR, testfile) as_file = os.path.join(TEST_DIR, testfile)
swf_file = os.path.join(TEST_DIR, test_id + '.swf') swf_file = os.path.join(TEST_DIR, test_id + '.swf')
if ((not os.path.exists(swf_file)) or if ((not os.path.exists(swf_file))
os.path.getmtime(swf_file) < os.path.getmtime(as_file)): or os.path.getmtime(swf_file) < os.path.getmtime(as_file)):
# Recompile # Recompile
try: try:
subprocess.check_call([ subprocess.check_call([

@ -19,6 +19,7 @@ from youtube_dl.utils import (
age_restricted, age_restricted,
args_to_str, args_to_str,
encode_base_n, encode_base_n,
caesar,
clean_html, clean_html,
date_from_str, date_from_str,
DateRange, DateRange,
@ -33,11 +34,13 @@ from youtube_dl.utils import (
ExtractorError, ExtractorError,
find_xpath_attr, find_xpath_attr,
fix_xml_ampersands, fix_xml_ampersands,
float_or_none,
get_element_by_class, get_element_by_class,
get_element_by_attribute, get_element_by_attribute,
get_elements_by_class, get_elements_by_class,
get_elements_by_attribute, get_elements_by_attribute,
InAdvancePagedList, InAdvancePagedList,
int_or_none,
intlist_to_bytes, intlist_to_bytes,
is_html, is_html,
js_to_json, js_to_json,
@ -55,6 +58,7 @@ from youtube_dl.utils import (
parse_count, parse_count,
parse_iso8601, parse_iso8601,
parse_resolution, parse_resolution,
parse_bitrate,
pkcs1pad, pkcs1pad,
read_batch_urls, read_batch_urls,
sanitize_filename, sanitize_filename,
@ -66,10 +70,13 @@ from youtube_dl.utils import (
remove_start, remove_start,
remove_end, remove_end,
remove_quotes, remove_quotes,
rot47,
shell_quote, shell_quote,
smuggle_url, smuggle_url,
str_to_int, str_to_int,
strip_jsonp, strip_jsonp,
strip_or_none,
subtitles_filename,
timeconvert, timeconvert,
unescapeHTML, unescapeHTML,
unified_strdate, unified_strdate,
@ -180,7 +187,7 @@ class TestUtil(unittest.TestCase):
self.assertEqual(sanitize_filename( self.assertEqual(sanitize_filename(
'ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖŐØŒÙÚÛÜŰÝÞßàáâãäåæçèéêëìíîïðñòóôõöőøœùúûüűýþÿ', restricted=True), 'ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖŐØŒÙÚÛÜŰÝÞßàáâãäåæçèéêëìíîïðñòóôõöőøœùúûüűýþÿ', restricted=True),
'AAAAAAAECEEEEIIIIDNOOOOOOOOEUUUUUYPssaaaaaaaeceeeeiiiionooooooooeuuuuuypy') 'AAAAAAAECEEEEIIIIDNOOOOOOOOEUUUUUYTHssaaaaaaaeceeeeiiiionooooooooeuuuuuythy')
def test_sanitize_ids(self): def test_sanitize_ids(self):
self.assertEqual(sanitize_filename('_n_cd26wFpw', is_id=True), '_n_cd26wFpw') self.assertEqual(sanitize_filename('_n_cd26wFpw', is_id=True), '_n_cd26wFpw')
@ -257,6 +264,11 @@ class TestUtil(unittest.TestCase):
self.assertEqual(replace_extension('.abc', 'temp'), '.abc.temp') self.assertEqual(replace_extension('.abc', 'temp'), '.abc.temp')
self.assertEqual(replace_extension('.abc.ext', 'temp'), '.abc.temp') self.assertEqual(replace_extension('.abc.ext', 'temp'), '.abc.temp')
def test_subtitles_filename(self):
self.assertEqual(subtitles_filename('abc.ext', 'en', 'vtt'), 'abc.en.vtt')
self.assertEqual(subtitles_filename('abc.ext', 'en', 'vtt', 'ext'), 'abc.en.vtt')
self.assertEqual(subtitles_filename('abc.unexpected_ext', 'en', 'vtt', 'ext'), 'abc.unexpected_ext.en.vtt')
def test_remove_start(self): def test_remove_start(self):
self.assertEqual(remove_start(None, 'A - '), None) self.assertEqual(remove_start(None, 'A - '), None)
self.assertEqual(remove_start('A - B', 'A - '), 'B') self.assertEqual(remove_start('A - B', 'A - '), 'B')
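The three `subtitles_filename` assertions above can be read as a thin wrapper around `replace_extension`. The following is a minimal sketch that satisfies those assertions; it is an assumption about the shape of the helper, not a copy of the code in `youtube_dl/utils.py`.

```python
import os.path


def replace_extension(filename, ext, expected_real_ext=None):
    """Swap the extension, but only if it is the one we expected (if any)."""
    name, real_ext = os.path.splitext(filename)
    keep_stem = not expected_real_ext or real_ext[1:] == expected_real_ext
    # When the real extension is unexpected, append instead of replacing.
    return '{0}.{1}'.format(name if keep_stem else filename, ext)


def subtitles_filename(filename, sub_lang, sub_format, expected_real_ext=None):
    """Build '<stem>.<lang>.<format>' from a media filename."""
    return replace_extension(
        filename, sub_lang + '.' + sub_format, expected_real_ext)
```

Passing `expected_real_ext` guards against mangling names whose last dot is part of the title rather than a real extension.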
@ -330,6 +342,8 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unified_strdate('July 15th, 2013'), '20130715') self.assertEqual(unified_strdate('July 15th, 2013'), '20130715')
self.assertEqual(unified_strdate('September 1st, 2013'), '20130901') self.assertEqual(unified_strdate('September 1st, 2013'), '20130901')
self.assertEqual(unified_strdate('Sep 2nd, 2013'), '20130902') self.assertEqual(unified_strdate('Sep 2nd, 2013'), '20130902')
self.assertEqual(unified_strdate('November 3rd, 2019'), '20191103')
self.assertEqual(unified_strdate('October 23rd, 2005'), '20051023')
def test_unified_timestamps(self): def test_unified_timestamps(self):
self.assertEqual(unified_timestamp('December 21, 2010'), 1292889600) self.assertEqual(unified_timestamp('December 21, 2010'), 1292889600)
@ -467,9 +481,30 @@ class TestUtil(unittest.TestCase):
shell_quote(args), shell_quote(args),
"""ffmpeg -i 'ñ€ß'"'"'.mp4'""" if compat_os_name != 'nt' else '''ffmpeg -i "ñ€ß'.mp4"''') """ffmpeg -i 'ñ€ß'"'"'.mp4'""" if compat_os_name != 'nt' else '''ffmpeg -i "ñ€ß'.mp4"''')
def test_float_or_none(self):
self.assertEqual(float_or_none('42.42'), 42.42)
self.assertEqual(float_or_none('42'), 42.0)
self.assertEqual(float_or_none(''), None)
self.assertEqual(float_or_none(None), None)
self.assertEqual(float_or_none([]), None)
self.assertEqual(float_or_none(set()), None)
def test_int_or_none(self):
self.assertEqual(int_or_none('42'), 42)
self.assertEqual(int_or_none(''), None)
self.assertEqual(int_or_none(None), None)
self.assertEqual(int_or_none([]), None)
self.assertEqual(int_or_none(set()), None)
def test_str_to_int(self): def test_str_to_int(self):
self.assertEqual(str_to_int('123,456'), 123456) self.assertEqual(str_to_int('123,456'), 123456)
self.assertEqual(str_to_int('123.456'), 123456) self.assertEqual(str_to_int('123.456'), 123456)
self.assertEqual(str_to_int(523), 523)
# Python 3 has no long
if sys.version_info < (3, 0):
eval('self.assertEqual(str_to_int(123456L), 123456)')
self.assertEqual(str_to_int('noninteger'), None)
self.assertEqual(str_to_int([]), None)
def test_url_basename(self): def test_url_basename(self):
self.assertEqual(url_basename('http://foo.de/'), '') self.assertEqual(url_basename('http://foo.de/'), '')
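The new `float_or_none`, `int_or_none` and `str_to_int` tests above pin down one behaviour: unconvertible input yields `None` instead of raising. A simplified Python 3 sketch of that contract follows; the real helpers take extra parameters (such as scaling) that are left out here, so treat the signatures as assumptions.

```python
import re


def int_or_none(v, default=None):
    """Return int(v), or default when v is None or not convertible."""
    if v is None:
        return default
    try:
        return int(v)
    except (ValueError, TypeError):
        return default


def float_or_none(v, default=None):
    """Return float(v), or default when v is None or not convertible."""
    if v is None:
        return default
    try:
        return float(v)
    except (ValueError, TypeError):
        return default


def str_to_int(int_str):
    """Parse integers written with separators, e.g. '123,456' or '123.456'."""
    if isinstance(int_str, int):
        return int_str
    if isinstance(int_str, str):
        # Drop thousands separators and plus signs before converting.
        return int_or_none(re.sub(r'[,.+]', '', int_str))
    return None
```

Catching `TypeError` as well as `ValueError` is what makes inputs such as `[]` or `set()` safe.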
@ -507,6 +542,8 @@ class TestUtil(unittest.TestCase):
self.assertEqual(urljoin('http://foo.de/', ''), None) self.assertEqual(urljoin('http://foo.de/', ''), None)
self.assertEqual(urljoin('http://foo.de/', ['foobar']), None) self.assertEqual(urljoin('http://foo.de/', ['foobar']), None)
self.assertEqual(urljoin('http://foo.de/a/b/c.txt', '.././../d.txt'), 'http://foo.de/d.txt') self.assertEqual(urljoin('http://foo.de/a/b/c.txt', '.././../d.txt'), 'http://foo.de/d.txt')
self.assertEqual(urljoin('http://foo.de/a/b/c.txt', 'rtmp://foo.de'), 'rtmp://foo.de')
self.assertEqual(urljoin(None, 'rtmp://foo.de'), 'rtmp://foo.de')
def test_url_or_none(self): def test_url_or_none(self):
self.assertEqual(url_or_none(None), None) self.assertEqual(url_or_none(None), None)
@ -732,6 +769,18 @@ class TestUtil(unittest.TestCase):
d = json.loads(stripped) d = json.loads(stripped)
self.assertEqual(d, {'status': 'success'}) self.assertEqual(d, {'status': 'success'})
def test_strip_or_none(self):
self.assertEqual(strip_or_none(' abc'), 'abc')
self.assertEqual(strip_or_none('abc '), 'abc')
self.assertEqual(strip_or_none(' abc '), 'abc')
self.assertEqual(strip_or_none('\tabc\t'), 'abc')
self.assertEqual(strip_or_none('\n\tabc\n\t'), 'abc')
self.assertEqual(strip_or_none('abc'), 'abc')
self.assertEqual(strip_or_none(''), '')
self.assertEqual(strip_or_none(None), None)
self.assertEqual(strip_or_none(42), None)
self.assertEqual(strip_or_none([]), None)
def test_uppercase_escape(self): def test_uppercase_escape(self):
self.assertEqual(uppercase_escape(''), '') self.assertEqual(uppercase_escape(''), '')
self.assertEqual(uppercase_escape('\\U0001d550'), '𝕐') self.assertEqual(uppercase_escape('\\U0001d550'), '𝕐')
@ -789,6 +838,15 @@ class TestUtil(unittest.TestCase):
'vcodec': 'av01.0.05M.08', 'vcodec': 'av01.0.05M.08',
'acodec': 'none', 'acodec': 'none',
}) })
self.assertEqual(parse_codecs('theora, vorbis'), {
'vcodec': 'theora',
'acodec': 'vorbis',
})
self.assertEqual(parse_codecs('unknownvcodec, unknownacodec'), {
'vcodec': 'unknownvcodec',
'acodec': 'unknownacodec',
})
self.assertEqual(parse_codecs('unknown'), {})
def test_escape_rfc3986(self): def test_escape_rfc3986(self):
reserved = "!*'();:@&=+$,/?#[]" reserved = "!*'();:@&=+$,/?#[]"
@ -1028,6 +1086,13 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_resolution('4k'), {'height': 2160}) self.assertEqual(parse_resolution('4k'), {'height': 2160})
self.assertEqual(parse_resolution('8K'), {'height': 4320}) self.assertEqual(parse_resolution('8K'), {'height': 4320})
def test_parse_bitrate(self):
self.assertEqual(parse_bitrate(None), None)
self.assertEqual(parse_bitrate(''), None)
self.assertEqual(parse_bitrate('300kbps'), 300)
self.assertEqual(parse_bitrate('1500kbps'), 1500)
self.assertEqual(parse_bitrate('300 kbps'), 300)
def test_version_tuple(self): def test_version_tuple(self):
self.assertEqual(version_tuple('1'), (1,)) self.assertEqual(version_tuple('1'), (1,))
self.assertEqual(version_tuple('10.23.344'), (10, 23, 344)) self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))
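The `test_parse_bitrate` cases above suggest a small regex-based helper. This is a sketch consistent with those cases only; the exact pattern used by the project is an assumption.

```python
import re


def parse_bitrate(s):
    """Extract the integer in front of 'kbps', or None if there is none."""
    if not isinstance(s, str):
        return None
    m = re.search(r'\b(\d+)\s*kbps', s)
    return int(m.group(1)) if m else None
```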
@ -1312,6 +1377,20 @@ Line 1
self.assertRaises(ValueError, encode_base_n, 0, 70) self.assertRaises(ValueError, encode_base_n, 0, 70)
self.assertRaises(ValueError, encode_base_n, 0, 60, custom_table) self.assertRaises(ValueError, encode_base_n, 0, 60, custom_table)
def test_caesar(self):
self.assertEqual(caesar('ace', 'abcdef', 2), 'cea')
self.assertEqual(caesar('cea', 'abcdef', -2), 'ace')
self.assertEqual(caesar('ace', 'abcdef', -2), 'eac')
self.assertEqual(caesar('eac', 'abcdef', 2), 'ace')
self.assertEqual(caesar('ace', 'abcdef', 0), 'ace')
self.assertEqual(caesar('xyz', 'abcdef', 2), 'xyz')
self.assertEqual(caesar('abc', 'acegik', 2), 'ebg')
self.assertEqual(caesar('ebg', 'acegik', -2), 'abc')
def test_rot47(self):
self.assertEqual(rot47('youtube-dl'), r'J@FEF36\5=')
self.assertEqual(rot47('YOUTUBE-DL'), r'*~&%&qt\s{')
def test_urshift(self): def test_urshift(self):
self.assertEqual(urshift(3, 1), 1) self.assertEqual(urshift(3, 1), 1)
self.assertEqual(urshift(-3, 1), 2147483646) self.assertEqual(urshift(-3, 1), 2147483646)
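The `test_caesar` and `test_rot47` cases above describe a shift cipher over an arbitrary alphabet, with ROT47 as the special case of shifting by 47 over the 94 printable ASCII characters. A minimal sketch that reproduces the tested values is shown below; the constant name `ROT47_ALPHABET` is introduced here for illustration.

```python
def caesar(s, alphabet, shift):
    """Shift each character of s found in alphabet by shift positions.

    Characters that are not in the alphabet pass through unchanged.
    """
    if shift == 0:
        return s
    size = len(alphabet)
    return ''.join(
        alphabet[(alphabet.index(c) + shift) % size] if c in alphabet else c
        for c in s)


# Printable ASCII from '!' (33) to '~' (126): 94 characters in total.
ROT47_ALPHABET = ''.join(chr(i) for i in range(33, 127))


def rot47(s):
    """ROT47 is its own inverse because 47 is half of 94."""
    return caesar(s, ROT47_ALPHABET, 47)
```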

@ -0,0 +1,6 @@
# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This is a generated file! Do not edit.
#HttpOnly_www.foobar.foobar FALSE / TRUE 2147483647 HTTPONLY_COOKIE HTTPONLY_COOKIE_VALUE
www.foobar.foobar FALSE / TRUE 2147483647 JS_ACCESSIBLE_COOKIE JS_ACCESSIBLE_COOKIE_VALUE
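The fixture above exercises the `#HttpOnly_` prefix: such a line looks like a comment but is a real cookie entry whose domain follows the prefix. The sketch below shows the idea on its own, without `YoutubeDLCookieJar`; it assumes the usual seven tab-separated fields of the Netscape format, and the function name `parse_cookie_lines` is hypothetical.

```python
HTTPONLY_PREFIX = '#HttpOnly_'


def parse_cookie_lines(lines):
    """Return {name: value} for every cookie entry, HttpOnly ones included."""
    cookies = {}
    for line in lines:
        line = line.rstrip('\n')
        if line.startswith(HTTPONLY_PREFIX):
            # Not a comment: strip the prefix so the domain field is usable.
            line = line[len(HTTPONLY_PREFIX):]
        if not line.strip() or line.startswith('#'):
            continue  # blank line or genuine comment
        fields = line.split('\t')
        if len(fields) != 7:
            continue  # malformed entry
        _domain, _subdomains, _path, _secure, _expires, name, value = fields
        cookies[name] = value
    return cookies


SAMPLE = [
    '# Netscape HTTP Cookie File\n',
    '#HttpOnly_www.foobar.foobar\tFALSE\t/\tTRUE\t2147483647'
    '\tHTTPONLY_COOKIE\tHTTPONLY_COOKIE_VALUE\n',
    'www.foobar.foobar\tFALSE\t/\tTRUE\t2147483647'
    '\tJS_ACCESSIBLE_COOKIE\tJS_ACCESSIBLE_COOKIE_VALUE\n',
]
```

The prefix check has to run before the comment check, otherwise HttpOnly cookies are silently dropped.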

test/testdata/m3u8/ted_18923.m3u8 vendored Normal file

@ -0,0 +1,28 @@
#EXTM3U
#EXT-X-VERSION:4
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=1255659,PROGRAM-ID=1,CODECS="avc1.42c01e,mp4a.40.2",RESOLUTION=640x360
/videos/BorisHesser_2018S/video/600k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=163154,PROGRAM-ID=1,CODECS="avc1.42c00c,mp4a.40.2",RESOLUTION=320x180
/videos/BorisHesser_2018S/video/64k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=481701,PROGRAM-ID=1,CODECS="avc1.42c015,mp4a.40.2",RESOLUTION=512x288
/videos/BorisHesser_2018S/video/180k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=769968,PROGRAM-ID=1,CODECS="avc1.42c015,mp4a.40.2",RESOLUTION=512x288
/videos/BorisHesser_2018S/video/320k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=984037,PROGRAM-ID=1,CODECS="avc1.42c015,mp4a.40.2",RESOLUTION=512x288
/videos/BorisHesser_2018S/video/450k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=1693925,PROGRAM-ID=1,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=853x480
/videos/BorisHesser_2018S/video/950k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=2462469,PROGRAM-ID=1,CODECS="avc1.640028,mp4a.40.2",RESOLUTION=1280x720
/videos/BorisHesser_2018S/video/1500k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=68101,PROGRAM-ID=1,CODECS="mp4a.40.2",DEFAULT=YES
/videos/BorisHesser_2018S/audio/600k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=74298,PROGRAM-ID=1,CODECS="avc1.42c00c",RESOLUTION=320x180,URI="/videos/BorisHesser_2018S/video/64k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=216200,PROGRAM-ID=1,CODECS="avc1.42c015",RESOLUTION=512x288,URI="/videos/BorisHesser_2018S/video/180k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=304717,PROGRAM-ID=1,CODECS="avc1.42c015",RESOLUTION=512x288,URI="/videos/BorisHesser_2018S/video/320k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=350933,PROGRAM-ID=1,CODECS="avc1.42c015",RESOLUTION=512x288,URI="/videos/BorisHesser_2018S/video/450k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=495850,PROGRAM-ID=1,CODECS="avc1.42c01e",RESOLUTION=640x360,URI="/videos/BorisHesser_2018S/video/600k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=810750,PROGRAM-ID=1,CODECS="avc1.4d401f",RESOLUTION=853x480,URI="/videos/BorisHesser_2018S/video/950k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=1273700,PROGRAM-ID=1,CODECS="avc1.640028",RESOLUTION=1280x720,URI="/videos/BorisHesser_2018S/video/1500k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="600k",LANGUAGE="en",NAME="Audio",AUTOSELECT=YES,DEFAULT=YES,URI="/videos/BorisHesser_2018S/audio/600k.m3u8?nobumpers=true&uniqueId=76011e2b",BANDWIDTH=614400
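To relate this playlist to the expected formats in the test (`'format_id': '769'` for `BANDWIDTH=769968`, absolute `hls.ted.com` URLs), here is a rough standalone sketch of reading `#EXT-X-STREAM-INF` entries. It is not the extractor's `_parse_m3u8_formats`; the playlist URL `PLAYLIST_URL` is a made-up placeholder, and deriving the id from whole kbit/s is inferred from the expected values.

```python
import re
from urllib.parse import urljoin

# KEY=value pairs; a quoted value may itself contain commas.
ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^",]+)')


def parse_attributes(attr_list):
    """Turn an HLS attribute list into a dict, dropping surrounding quotes."""
    return {key: value.strip('"') for key, value in ATTR_RE.findall(attr_list)}


def parse_variants(playlist, playlist_url):
    """Collect variant streams; the URI is the line following each tag."""
    lines = [line.strip() for line in playlist.splitlines() if line.strip()]
    variants = []
    for tag, uri in zip(lines, lines[1:]):
        if not tag.startswith('#EXT-X-STREAM-INF:'):
            continue  # also skips #EXT-X-I-FRAME-STREAM-INF entries
        attrs = parse_attributes(tag.split(':', 1)[1])
        variant = {
            'url': urljoin(playlist_url, uri),
            # BANDWIDTH is in bits per second; ids use whole kbit/s.
            'format_id': str(int(attrs['BANDWIDTH']) // 1000),
        }
        if 'RESOLUTION' in attrs:
            width, height = attrs['RESOLUTION'].split('x')
            variant['width'], variant['height'] = int(width), int(height)
        variants.append(variant)
    return variants


PLAYLIST_URL = 'http://hls.ted.com/talks/18923.m3u8'  # hypothetical
SAMPLE = '''#EXTM3U
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=769968,PROGRAM-ID=1,CODECS="avc1.42c015,mp4a.40.2",RESOLUTION=512x288
/videos/BorisHesser_2018S/video/320k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=304717,PROGRAM-ID=1,CODECS="avc1.42c015",RESOLUTION=512x288,URI="/videos/BorisHesser_2018S/video/320k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
'''
```

Because the playlist uses host-relative paths, `urljoin` keeps only the scheme and host of the playlist URL.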

test/testdata/mpd/unfragmented.mpd vendored Normal file

@ -0,0 +1,28 @@
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<MPD mediaPresentationDuration="PT54.915S" minBufferTime="PT1.500S" profiles="urn:mpeg:dash:profile:isoff-on-demand:2011" type="static" xmlns="urn:mpeg:dash:schema:mpd:2011">
<Period duration="PT54.915S">
<AdaptationSet segmentAlignment="true" subsegmentAlignment="true" subsegmentStartsWithSAP="1">
<Representation bandwidth="804261" codecs="avc1.4d401e" frameRate="30" height="360" id="VIDEO-1" mimeType="video/mp4" startWithSAP="1" width="360">
<BaseURL>DASH_360</BaseURL>
<SegmentBase indexRange="915-1114" indexRangeExact="true">
<Initialization range="0-914"/>
</SegmentBase>
</Representation>
<Representation bandwidth="608000" codecs="avc1.4d401e" frameRate="30" height="240" id="VIDEO-2" mimeType="video/mp4" startWithSAP="1" width="240">
<BaseURL>DASH_240</BaseURL>
<SegmentBase indexRange="913-1112" indexRangeExact="true">
<Initialization range="0-912"/>
</SegmentBase>
</Representation>
</AdaptationSet>
<AdaptationSet>
<Representation audioSamplingRate="48000" bandwidth="129870" codecs="mp4a.40.2" id="AUDIO-1" mimeType="audio/mp4" startWithSAP="1">
<AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2"/>
<BaseURL>audio</BaseURL>
<SegmentBase indexRange="832-1007" indexRangeExact="true">
<Initialization range="0-831"/>
</SegmentBase>
</Representation>
</AdaptationSet>
</Period>
</MPD>
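The `unfragmented` test case pairs this manifest with `mpd_base_url='https://v.redd.it/hw1x7rcg7zl21'` and expects URLs such as `https://v.redd.it/hw1x7rcg7zl21/DASH_360`. The sketch below illustrates only that resolution step with the standard library; it is not `_parse_mpd_formats`, and the function name `representation_urls` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

NS = {'mpd': 'urn:mpeg:dash:schema:mpd:2011'}


def representation_urls(mpd_xml, mpd_base_url):
    """Map each Representation id to its BaseURL, resolved against the base."""
    root = ET.fromstring(mpd_xml)
    urls = {}
    for rep in root.iterfind('.//mpd:Representation', NS):
        base = rep.findtext('mpd:BaseURL', default='', namespaces=NS).strip()
        # Relative BaseURL values are joined onto the manifest's base URL.
        if mpd_base_url and not re.match(r'^https?://', base):
            base = mpd_base_url.rstrip('/') + '/' + base
        urls[rep.get('id')] = base
    return urls


SAMPLE_MPD = '''<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet>
      <Representation id="VIDEO-1"><BaseURL>DASH_360</BaseURL></Representation>
    </AdaptationSet>
    <AdaptationSet>
      <Representation id="AUDIO-1"><BaseURL>audio</BaseURL></Representation>
    </AdaptationSet>
  </Period>
</MPD>'''
```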

@ -7,7 +7,7 @@
# https://github.com/zsh-users/antigen # https://github.com/zsh-users/antigen
# Install youtube-dl: # Install youtube-dl:
# antigen bundle rg3/youtube-dl # antigen bundle ytdl-org/youtube-dl
# Bundles installed by antigen are available for use immediately. # Bundles installed by antigen are available for use immediately.
# Update youtube-dl (and all other antigen bundles): # Update youtube-dl (and all other antigen bundles):

@ -82,6 +82,7 @@ from .utils import (
sanitize_url, sanitize_url,
sanitized_Request, sanitized_Request,
std_headers, std_headers,
str_or_none,
subtitles_filename, subtitles_filename,
UnavailableVideoError, UnavailableVideoError,
url_basename, url_basename,
@ -308,6 +309,8 @@ class YoutubeDL(object):
The following options are used by the post processors: The following options are used by the post processors:
prefer_ffmpeg: If False, use avconv instead of ffmpeg if both are available, prefer_ffmpeg: If False, use avconv instead of ffmpeg if both are available,
otherwise prefer ffmpeg. otherwise prefer ffmpeg.
ffmpeg_location: Location of the ffmpeg/avconv binary; either the path
to the binary or its containing directory.
postprocessor_args: A list of additional command-line arguments for the postprocessor_args: A list of additional command-line arguments for the
postprocessor. postprocessor.
@ -397,9 +400,9 @@ class YoutubeDL(object):
else: else:
raise raise
-if (sys.platform != 'win32' and
-        sys.getfilesystemencoding() in ['ascii', 'ANSI_X3.4-1968'] and
-        not params.get('restrictfilenames', False)):
+if (sys.platform != 'win32'
+        and sys.getfilesystemencoding() in ['ascii', 'ANSI_X3.4-1968']
+        and not params.get('restrictfilenames', False)):
# Unicode filesystem API will throw errors (#1474, #13027) # Unicode filesystem API will throw errors (#1474, #13027)
self.report_warning( self.report_warning(
'Assuming --restrict-filenames since file system encoding ' 'Assuming --restrict-filenames since file system encoding '
@ -437,9 +440,9 @@ class YoutubeDL(object):
if re.match(r'^-[0-9A-Za-z_-]{10}$', a)] if re.match(r'^-[0-9A-Za-z_-]{10}$', a)]
if idxs: if idxs:
correct_argv = ( correct_argv = (
['youtube-dl'] + ['youtube-dl']
[a for i, a in enumerate(argv) if i not in idxs] + + [a for i, a in enumerate(argv) if i not in idxs]
['--'] + [argv[i] for i in idxs] + ['--'] + [argv[i] for i in idxs]
) )
self.report_warning( self.report_warning(
'Long argument string detected. ' 'Long argument string detected. '
@ -847,10 +850,11 @@ class YoutubeDL(object):
if result_type in ('url', 'url_transparent'): if result_type in ('url', 'url_transparent'):
ie_result['url'] = sanitize_url(ie_result['url']) ie_result['url'] = sanitize_url(ie_result['url'])
extract_flat = self.params.get('extract_flat', False) extract_flat = self.params.get('extract_flat', False)
-if ((extract_flat == 'in_playlist' and 'playlist' in extra_info) or
-        extract_flat is True):
-    if self.params.get('forcejson', False):
-        self.to_stdout(json.dumps(ie_result))
+if ((extract_flat == 'in_playlist' and 'playlist' in extra_info)
+        or extract_flat is True):
+    self.__forced_printings(
+        ie_result, self.prepare_filename(ie_result),
+        incomplete=True)
return ie_result return ie_result
if result_type == 'video': if result_type == 'video':
@ -888,7 +892,7 @@ class YoutubeDL(object):
# url_transparent. In such cases outer metadata (from ie_result) # url_transparent. In such cases outer metadata (from ie_result)
# should be propagated to inner one (info). For this to happen # should be propagated to inner one (info). For this to happen
# _type of info should be overridden with url_transparent. This # _type of info should be overridden with url_transparent. This
# fixes issue from https://github.com/rg3/youtube-dl/pull/11163. # fixes issue from https://github.com/ytdl-org/youtube-dl/pull/11163.
if new_result.get('_type') == 'url': if new_result.get('_type') == 'url':
new_result['_type'] = 'url_transparent' new_result['_type'] = 'url_transparent'
@ -1063,21 +1067,24 @@ class YoutubeDL(object):
 if not m:
     STR_OPERATORS = {
         '=': operator.eq,
-        '!=': operator.ne,
         '^=': lambda attr, value: attr.startswith(value),
         '$=': lambda attr, value: attr.endswith(value),
         '*=': lambda attr, value: value in attr,
     }
     str_operator_rex = re.compile(r'''(?x)
         \s*(?P<key>ext|acodec|vcodec|container|protocol|format_id)
-        \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
+        \s*(?P<negation>!\s*)?(?P<op>%s)(?P<none_inclusive>\s*\?)?
         \s*(?P<value>[a-zA-Z0-9._-]+)
         \s*$
         ''' % '|'.join(map(re.escape, STR_OPERATORS.keys())))
     m = str_operator_rex.search(filter_spec)
     if m:
         comparison_value = m.group('value')
-        op = STR_OPERATORS[m.group('op')]
+        str_op = STR_OPERATORS[m.group('op')]
+        if m.group('negation'):
+            op = lambda attr, value: not str_op(attr, value)
+        else:
+            op = str_op
 if not m:
     raise ValueError('Invalid filter specification %r' % filter_spec)
@ -1602,7 +1609,7 @@ class YoutubeDL(object):
# by extractor are incomplete or not (i.e. whether extractor provides only # by extractor are incomplete or not (i.e. whether extractor provides only
# video-only or audio-only formats) for proper formats selection for # video-only or audio-only formats) for proper formats selection for
# extractors with such incomplete formats (see # extractors with such incomplete formats (see
# https://github.com/rg3/youtube-dl/pull/5556). # https://github.com/ytdl-org/youtube-dl/pull/5556).
# Since formats may be filtered during format selection and may not match # Since formats may be filtered during format selection and may not match
# the original formats the results may be incorrect. Thus original formats # the original formats the results may be incorrect. Thus original formats
# or pre-calculated metrics should be passed to format selection routines # or pre-calculated metrics should be passed to format selection routines
@ -1610,12 +1617,12 @@ class YoutubeDL(object):
# We will pass a context object containing all necessary additional data # We will pass a context object containing all necessary additional data
# instead of just formats. # instead of just formats.
# This fixes incorrect format selection issue (see # This fixes incorrect format selection issue (see
# https://github.com/rg3/youtube-dl/issues/10083). # https://github.com/ytdl-org/youtube-dl/issues/10083).
incomplete_formats = ( incomplete_formats = (
# All formats are video-only or # All formats are video-only or
all(f.get('vcodec') != 'none' and f.get('acodec') == 'none' for f in formats) or all(f.get('vcodec') != 'none' and f.get('acodec') == 'none' for f in formats)
# all formats are audio-only # all formats are audio-only
all(f.get('vcodec') == 'none' and f.get('acodec') != 'none' for f in formats)) or all(f.get('vcodec') == 'none' and f.get('acodec') != 'none' for f in formats))
ctx = { ctx = {
'formats': formats, 'formats': formats,
@ -1687,6 +1694,36 @@ class YoutubeDL(object):
subs[lang] = f subs[lang] = f
return subs return subs
def __forced_printings(self, info_dict, filename, incomplete):
def print_mandatory(field):
if (self.params.get('force%s' % field, False)
and (not incomplete or info_dict.get(field) is not None)):
self.to_stdout(info_dict[field])
def print_optional(field):
if (self.params.get('force%s' % field, False)
and info_dict.get(field) is not None):
self.to_stdout(info_dict[field])
print_mandatory('title')
print_mandatory('id')
if self.params.get('forceurl', False) and not incomplete:
if info_dict.get('requested_formats') is not None:
for f in info_dict['requested_formats']:
self.to_stdout(f['url'] + f.get('play_path', ''))
else:
# For RTMP URLs, also include the playpath
self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
print_optional('thumbnail')
print_optional('description')
if self.params.get('forcefilename', False) and filename is not None:
self.to_stdout(filename)
if self.params.get('forceduration', False) and info_dict.get('duration') is not None:
self.to_stdout(formatSeconds(info_dict['duration']))
print_mandatory('format')
if self.params.get('forcejson', False):
self.to_stdout(json.dumps(info_dict))
def process_info(self, info_dict): def process_info(self, info_dict):
"""Process a single resolved IE result.""" """Process a single resolved IE result."""
@ -1697,9 +1734,8 @@ class YoutubeDL(object):
if self._num_downloads >= int(max_downloads): if self._num_downloads >= int(max_downloads):
raise MaxDownloadsReached() raise MaxDownloadsReached()
# TODO: backward compatibility, to be removed
info_dict['fulltitle'] = info_dict['title'] info_dict['fulltitle'] = info_dict['title']
if len(info_dict['title']) > 200:
info_dict['title'] = info_dict['title'][:197] + '...'
if 'format' not in info_dict: if 'format' not in info_dict:
info_dict['format'] = info_dict['ext'] info_dict['format'] = info_dict['ext']
@ -1714,29 +1750,7 @@ class YoutubeDL(object):
info_dict['_filename'] = filename = self.prepare_filename(info_dict) info_dict['_filename'] = filename = self.prepare_filename(info_dict)
# Forced printings # Forced printings
if self.params.get('forcetitle', False): self.__forced_printings(info_dict, filename, incomplete=False)
self.to_stdout(info_dict['fulltitle'])
if self.params.get('forceid', False):
self.to_stdout(info_dict['id'])
if self.params.get('forceurl', False):
if info_dict.get('requested_formats') is not None:
for f in info_dict['requested_formats']:
self.to_stdout(f['url'] + f.get('play_path', ''))
else:
# For RTMP URLs, also include the playpath
self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
if self.params.get('forcethumbnail', False) and info_dict.get('thumbnail') is not None:
self.to_stdout(info_dict['thumbnail'])
if self.params.get('forcedescription', False) and info_dict.get('description') is not None:
self.to_stdout(info_dict['description'])
if self.params.get('forcefilename', False) and filename is not None:
self.to_stdout(filename)
if self.params.get('forceduration', False) and info_dict.get('duration') is not None:
self.to_stdout(formatSeconds(info_dict['duration']))
if self.params.get('forceformat', False):
self.to_stdout(info_dict['format'])
if self.params.get('forcejson', False):
self.to_stdout(json.dumps(info_dict))
# Do nothing else if in simulate mode # Do nothing else if in simulate mode
if self.params.get('simulate', False): if self.params.get('simulate', False):
@ -1777,6 +1791,8 @@ class YoutubeDL(object):
annofn = replace_extension(filename, 'annotations.xml', info_dict.get('ext')) annofn = replace_extension(filename, 'annotations.xml', info_dict.get('ext'))
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(annofn)): if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(annofn)):
self.to_screen('[info] Video annotations are already present') self.to_screen('[info] Video annotations are already present')
elif not info_dict.get('annotations'):
self.report_warning('There are no annotations to write.')
else: else:
try: try:
self.to_screen('[info] Writing video annotations to: ' + annofn) self.to_screen('[info] Writing video annotations to: ' + annofn)
@ -1798,7 +1814,7 @@ class YoutubeDL(object):
ie = self.get_info_extractor(info_dict['extractor_key']) ie = self.get_info_extractor(info_dict['extractor_key'])
for sub_lang, sub_info in subtitles.items(): for sub_lang, sub_info in subtitles.items():
sub_format = sub_info['ext'] sub_format = sub_info['ext']
sub_filename = subtitles_filename(filename, sub_lang, sub_format) sub_filename = subtitles_filename(filename, sub_lang, sub_format, info_dict.get('ext'))
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(sub_filename)): if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(sub_filename)):
self.to_screen('[info] Video subtitle %s.%s is already present' % (sub_lang, sub_format)) self.to_screen('[info] Video subtitle %s.%s is already present' % (sub_lang, sub_format))
else: else:
@ -1806,7 +1822,7 @@ class YoutubeDL(object):
if sub_info.get('data') is not None: if sub_info.get('data') is not None:
try: try:
# Use newline='' to prevent conversion of newline characters # Use newline='' to prevent conversion of newline characters
# See https://github.com/rg3/youtube-dl/issues/10268 # See https://github.com/ytdl-org/youtube-dl/issues/10268
with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8', newline='') as subfile: with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8', newline='') as subfile:
subfile.write(sub_info['data']) subfile.write(sub_info['data'])
except (OSError, IOError): except (OSError, IOError):
@ -1941,8 +1957,8 @@ class YoutubeDL(object):
else: else:
assert fixup_policy in ('ignore', 'never') assert fixup_policy in ('ignore', 'never')
if (info_dict.get('requested_formats') is None and if (info_dict.get('requested_formats') is None
info_dict.get('container') == 'm4a_dash'): and info_dict.get('container') == 'm4a_dash'):
if fixup_policy == 'warn': if fixup_policy == 'warn':
self.report_warning( self.report_warning(
'%s: writing DASH m4a. ' '%s: writing DASH m4a. '
@ -1961,9 +1977,9 @@ class YoutubeDL(object):
else: else:
assert fixup_policy in ('ignore', 'never') assert fixup_policy in ('ignore', 'never')
if (info_dict.get('protocol') == 'm3u8_native' or if (info_dict.get('protocol') == 'm3u8_native'
info_dict.get('protocol') == 'm3u8' and or info_dict.get('protocol') == 'm3u8'
self.params.get('hls_prefer_native')): and self.params.get('hls_prefer_native')):
if fixup_policy == 'warn': if fixup_policy == 'warn':
self.report_warning('%s: malformed AAC bitstream detected.' % ( self.report_warning('%s: malformed AAC bitstream detected.' % (
info_dict['id'])) info_dict['id']))
@ -1989,10 +2005,10 @@ class YoutubeDL(object):
def download(self, url_list): def download(self, url_list):
"""Download a given list of URLs.""" """Download a given list of URLs."""
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL) outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
if (len(url_list) > 1 and if (len(url_list) > 1
outtmpl != '-' and and outtmpl != '-'
'%' not in outtmpl and and '%' not in outtmpl
self.params.get('max_downloads') != 1): and self.params.get('max_downloads') != 1):
raise SameFileError(outtmpl) raise SameFileError(outtmpl)
for url in url_list: for url in url_list:
@ -2057,15 +2073,24 @@ class YoutubeDL(object):
self.report_warning('Unable to remove downloaded original file') self.report_warning('Unable to remove downloaded original file')
def _make_archive_id(self, info_dict): def _make_archive_id(self, info_dict):
video_id = info_dict.get('id')
if not video_id:
return
# Future-proof against any change in case # Future-proof against any change in case
# and backwards compatibility with prior versions # and backwards compatibility with prior versions
extractor = info_dict.get('extractor_key') extractor = info_dict.get('extractor_key') or info_dict.get('ie_key') # key in a playlist
if extractor is None: if extractor is None:
if 'id' in info_dict: url = str_or_none(info_dict.get('url'))
extractor = info_dict.get('ie_key') # key in a playlist if not url:
if extractor is None: return
return None # Incomplete video information # Try to find matching extractor for the URL and take its ie_key
return extractor.lower() + ' ' + info_dict['id'] for ie in self._ies:
if ie.suitable(url):
extractor = ie.ie_key()
break
else:
return
return extractor.lower() + ' ' + video_id
def in_download_archive(self, info_dict): def in_download_archive(self, info_dict):
fn = self.params.get('download_archive') fn = self.params.get('download_archive')
@ -2073,7 +2098,7 @@ class YoutubeDL(object):
return False return False
vid_id = self._make_archive_id(info_dict) vid_id = self._make_archive_id(info_dict)
if vid_id is None: if not vid_id:
return False # Incomplete video information return False # Incomplete video information
try: try:
@ -2128,8 +2153,8 @@ class YoutubeDL(object):
if res: if res:
res += ', ' res += ', '
res += '%s container' % fdict['container'] res += '%s container' % fdict['container']
if (fdict.get('vcodec') is not None and if (fdict.get('vcodec') is not None
fdict.get('vcodec') != 'none'): and fdict.get('vcodec') != 'none'):
if res: if res:
res += ', ' res += ', '
res += fdict['vcodec'] res += fdict['vcodec']
@ -2216,7 +2241,7 @@ class YoutubeDL(object):
return return
if type('') is not compat_str: if type('') is not compat_str:
# Python 2.6 on SLES11 SP1 (https://github.com/rg3/youtube-dl/issues/3326) # Python 2.6 on SLES11 SP1 (https://github.com/ytdl-org/youtube-dl/issues/3326)
self.report_warning( self.report_warning(
'Your Python is broken! Update to a newer and supported version') 'Your Python is broken! Update to a newer and supported version')
@ -2310,7 +2335,7 @@ class YoutubeDL(object):
proxies = {'http': opts_proxy, 'https': opts_proxy} proxies = {'http': opts_proxy, 'https': opts_proxy}
else: else:
proxies = compat_urllib_request.getproxies() proxies = compat_urllib_request.getproxies()
# Set HTTPS proxy to HTTP one if given (https://github.com/rg3/youtube-dl/issues/805) # Set HTTPS proxy to HTTP one if given (https://github.com/ytdl-org/youtube-dl/issues/805)
if 'http' in proxies and 'https' not in proxies: if 'http' in proxies and 'https' not in proxies:
proxies['https'] = proxies['http'] proxies['https'] = proxies['http']
proxy_handler = PerRequestProxyHandler(proxies) proxy_handler = PerRequestProxyHandler(proxies)
@ -2323,7 +2348,7 @@ class YoutubeDL(object):
# When passing our own FileHandler instance, build_opener won't add the # When passing our own FileHandler instance, build_opener won't add the
# default FileHandler and allows us to disable the file protocol, which # default FileHandler and allows us to disable the file protocol, which
# can be used for malicious purposes (see # can be used for malicious purposes (see
# https://github.com/rg3/youtube-dl/issues/8227) # https://github.com/ytdl-org/youtube-dl/issues/8227)
file_handler = compat_urllib_request.FileHandler() file_handler = compat_urllib_request.FileHandler()
def file_open(*args, **kwargs): def file_open(*args, **kwargs):
@ -2335,7 +2360,7 @@ class YoutubeDL(object):
# Delete the default user-agent header, which would otherwise apply in # Delete the default user-agent header, which would otherwise apply in
# cases where our custom HTTP handler doesn't come into play # cases where our custom HTTP handler doesn't come into play
# (See https://github.com/rg3/youtube-dl/issues/1309 for details) # (See https://github.com/ytdl-org/youtube-dl/issues/1309 for details)
opener.addheaders = [] opener.addheaders = []
self._opener = opener self._opener = opener
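
The `-1063,21 +1067,24` hunk above adds an optional `!` in front of the string comparison operators of the format filter, so that a spec such as `ext!=mp4` inverts the comparison. A minimal sketch of that parsing step follows. The operator table and the regular expression are taken from the hunk; `build_str_filter` and the way the predicate is applied to a format dict are assumptions made for illustration, not the method body in `YoutubeDL.py`.

```python
import operator
import re

# Operator table as shown in the hunk above.
STR_OPERATORS = {
    '=': operator.eq,
    '^=': lambda attr, value: attr.startswith(value),
    '$=': lambda attr, value: attr.endswith(value),
    '*=': lambda attr, value: value in attr,
}

# Same pattern as the new side of the diff: an optional '!' before the operator.
STR_OPERATOR_REX = re.compile(r'''(?x)
    \s*(?P<key>ext|acodec|vcodec|container|protocol|format_id)
    \s*(?P<negation>!\s*)?(?P<op>%s)(?P<none_inclusive>\s*\?)?
    \s*(?P<value>[a-zA-Z0-9._-]+)
    \s*$
    ''' % '|'.join(map(re.escape, STR_OPERATORS.keys())))


def build_str_filter(filter_spec):
    """Hypothetical helper: turn a spec such as 'ext!=mp4' into a predicate."""
    m = STR_OPERATOR_REX.search(filter_spec)
    if not m:
        raise ValueError('Invalid filter specification %r' % filter_spec)
    key = m.group('key')
    comparison_value = m.group('value')
    str_op = STR_OPERATORS[m.group('op')]
    if m.group('negation'):
        # Wrap the positive operator instead of defining a second table.
        op = lambda attr, value: not str_op(attr, value)
    else:
        op = str_op
    none_inclusive = bool(m.group('none_inclusive'))

    def _match(fmt):
        actual_value = fmt.get(key)
        if actual_value is None:
            # A trailing '?' lets formats without the field pass the filter.
            return none_inclusive
        return op(actual_value, comparison_value)

    return _match


keep_non_mp4 = build_str_filter('ext!=mp4')
print(keep_non_mp4({'ext': 'webm'}), keep_non_mp4({'ext': 'mp4'}))
```

Wrapping the existing operator keeps every string operator negatable (`!^=`, `!$=`, `!*=`) without adding new table entries, which appears to be why the standalone `'!='` entry is dropped on the new side.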


@ -48,7 +48,7 @@ from .YoutubeDL import YoutubeDL
def _real_main(argv=None): def _real_main(argv=None):
# Compatibility fixes for Windows # Compatibility fixes for Windows
if sys.platform == 'win32': if sys.platform == 'win32':
# https://github.com/rg3/youtube-dl/issues/820 # https://github.com/ytdl-org/youtube-dl/issues/820
codecs.register(lambda name: codecs.lookup('utf-8') if name == 'cp65001' else None) codecs.register(lambda name: codecs.lookup('utf-8') if name == 'cp65001' else None)
workaround_optparse_bug9161() workaround_optparse_bug9161()
@ -94,7 +94,7 @@ def _real_main(argv=None):
if opts.verbose: if opts.verbose:
write_string('[debug] Batch file urls: ' + repr(batch_urls) + '\n') write_string('[debug] Batch file urls: ' + repr(batch_urls) + '\n')
except IOError: except IOError:
sys.exit('ERROR: batch file could not be read') sys.exit('ERROR: batch file %s could not be read' % opts.batchfile)
all_urls = batch_urls + [url.strip() for url in args] # batch_urls are already striped in read_batch_urls all_urls = batch_urls + [url.strip() for url in args] # batch_urls are already striped in read_batch_urls
_enc = preferredencoding() _enc = preferredencoding()
all_urls = [url.decode(_enc, 'ignore') if isinstance(url, bytes) else url for url in all_urls] all_urls = [url.decode(_enc, 'ignore') if isinstance(url, bytes) else url for url in all_urls]
@ -166,6 +166,8 @@ def _real_main(argv=None):
if opts.max_sleep_interval is not None: if opts.max_sleep_interval is not None:
if opts.max_sleep_interval < 0: if opts.max_sleep_interval < 0:
parser.error('max sleep interval must be positive or 0') parser.error('max sleep interval must be positive or 0')
if opts.sleep_interval is None:
parser.error('min sleep interval must be specified, use --min-sleep-interval')
if opts.max_sleep_interval < opts.sleep_interval: if opts.max_sleep_interval < opts.sleep_interval:
parser.error('max sleep interval must be greater than or equal to min sleep interval') parser.error('max sleep interval must be greater than or equal to min sleep interval')
else: else:
@ -228,14 +230,14 @@ def _real_main(argv=None):
if opts.allsubtitles and not opts.writeautomaticsub: if opts.allsubtitles and not opts.writeautomaticsub:
opts.writesubtitles = True opts.writesubtitles = True
outtmpl = ((opts.outtmpl is not None and opts.outtmpl) or outtmpl = ((opts.outtmpl is not None and opts.outtmpl)
(opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s') or or (opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s')
(opts.format == '-1' and '%(id)s-%(format)s.%(ext)s') or or (opts.format == '-1' and '%(id)s-%(format)s.%(ext)s')
(opts.usetitle and opts.autonumber and '%(autonumber)s-%(title)s-%(id)s.%(ext)s') or or (opts.usetitle and opts.autonumber and '%(autonumber)s-%(title)s-%(id)s.%(ext)s')
(opts.usetitle and '%(title)s-%(id)s.%(ext)s') or or (opts.usetitle and '%(title)s-%(id)s.%(ext)s')
(opts.useid and '%(id)s.%(ext)s') or or (opts.useid and '%(id)s.%(ext)s')
(opts.autonumber and '%(autonumber)s-%(id)s.%(ext)s') or or (opts.autonumber and '%(autonumber)s-%(id)s.%(ext)s')
DEFAULT_OUTTMPL) or DEFAULT_OUTTMPL)
if not os.path.splitext(outtmpl)[1] and opts.extractaudio: if not os.path.splitext(outtmpl)[1] and opts.extractaudio:
parser.error('Cannot download a video and extract audio into the same' parser.error('Cannot download a video and extract audio into the same'
' file! Use "{0}.%(ext)s" instead of "{0}" as the output' ' file! Use "{0}.%(ext)s" instead of "{0}" as the output'
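
The `-166,6 +166,8` hunk above rejects a maximum sleep interval given without a minimum one. On Python 3, comparing an integer with `None` raises `TypeError`, so checking for `None` first turns a traceback into a readable option error. A minimal sketch of the same ordering of checks, assuming a hypothetical `validate_sleep_intervals` function that raises `ValueError` where the real code calls `parser.error`:

```python
def validate_sleep_intervals(sleep_interval, max_sleep_interval):
    """Hypothetical stand-in for the option checks shown in the hunk above."""
    if max_sleep_interval is None:
        return
    if max_sleep_interval < 0:
        raise ValueError('max sleep interval must be positive or 0')
    # This check must come before the comparison below: on Python 3,
    # `5 < None` raises TypeError instead of returning a boolean.
    if sleep_interval is None:
        raise ValueError('min sleep interval must be specified')
    if max_sleep_interval < sleep_interval:
        raise ValueError(
            'max sleep interval must be greater than or equal to min sleep interval')


validate_sleep_intervals(2, 5)  # valid combination, returns silently
```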


@ -2364,7 +2364,7 @@ except ImportError: # Python 2
# HACK: The following are the correct unquote_to_bytes, unquote and unquote_plus # HACK: The following are the correct unquote_to_bytes, unquote and unquote_plus
# implementations from cpython 3.4.3's stdlib. Python 2's version # implementations from cpython 3.4.3's stdlib. Python 2's version
# is apparently broken (see https://github.com/rg3/youtube-dl/pull/6244) # is apparently broken (see https://github.com/ytdl-org/youtube-dl/pull/6244)
def compat_urllib_parse_unquote_to_bytes(string): def compat_urllib_parse_unquote_to_bytes(string):
"""unquote_to_bytes('abc%20def') -> b'abc def'.""" """unquote_to_bytes('abc%20def') -> b'abc def'."""
@ -2508,6 +2508,15 @@ class _TreeBuilder(etree.TreeBuilder):
pass pass
try:
# xml.etree.ElementTree.Element is a method in Python <=2.6 and
# the following will crash with:
# TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types
isinstance(None, xml.etree.ElementTree.Element)
from xml.etree.ElementTree import Element as compat_etree_Element
except TypeError: # Python <=2.6
from xml.etree.ElementTree import _ElementInterface as compat_etree_Element
if sys.version_info[0] >= 3: if sys.version_info[0] >= 3:
def compat_etree_fromstring(text): def compat_etree_fromstring(text):
return etree.XML(text, parser=etree.XMLParser(target=_TreeBuilder())) return etree.XML(text, parser=etree.XMLParser(target=_TreeBuilder()))
@ -2640,9 +2649,9 @@ else:
try: try:
args = shlex.split('中文') args = shlex.split('中文')
assert (isinstance(args, list) and assert (isinstance(args, list)
isinstance(args[0], compat_str) and and isinstance(args[0], compat_str)
args[0] == '中文') and args[0] == '中文')
compat_shlex_split = shlex.split compat_shlex_split = shlex.split
except (AssertionError, UnicodeEncodeError): except (AssertionError, UnicodeEncodeError):
# Working around shlex issue with unicode strings on some python 2 # Working around shlex issue with unicode strings on some python 2
@ -2819,7 +2828,7 @@ else:
compat_socket_create_connection = socket.create_connection compat_socket_create_connection = socket.create_connection
# Fix https://github.com/rg3/youtube-dl/issues/4223 # Fix https://github.com/ytdl-org/youtube-dl/issues/4223
# See http://bugs.python.org/issue9161 for what is broken # See http://bugs.python.org/issue9161 for what is broken
def workaround_optparse_bug9161(): def workaround_optparse_bug9161():
op = optparse.OptionParser() op = optparse.OptionParser()
@ -2944,7 +2953,7 @@ if platform.python_implementation() == 'PyPy' and sys.pypy_version_info < (5, 4,
# PyPy2 prior to version 5.4.0 expects byte strings as Windows function # PyPy2 prior to version 5.4.0 expects byte strings as Windows function
# names, see the original PyPy issue [1] and the youtube-dl one [2]. # names, see the original PyPy issue [1] and the youtube-dl one [2].
# 1. https://bitbucket.org/pypy/pypy/issues/2360/windows-ctypescdll-typeerror-function-name # 1. https://bitbucket.org/pypy/pypy/issues/2360/windows-ctypescdll-typeerror-function-name
# 2. https://github.com/rg3/youtube-dl/pull/4392 # 2. https://github.com/ytdl-org/youtube-dl/pull/4392
def compat_ctypes_WINFUNCTYPE(*args, **kwargs): def compat_ctypes_WINFUNCTYPE(*args, **kwargs):
real = ctypes.WINFUNCTYPE(*args, **kwargs) real = ctypes.WINFUNCTYPE(*args, **kwargs)
@ -2969,6 +2978,7 @@ __all__ = [
'compat_cookiejar', 'compat_cookiejar',
'compat_cookies', 'compat_cookies',
'compat_ctypes_WINFUNCTYPE', 'compat_ctypes_WINFUNCTYPE',
'compat_etree_Element',
'compat_etree_fromstring', 'compat_etree_fromstring',
'compat_etree_register_namespace', 'compat_etree_register_namespace',
'compat_expanduser', 'compat_expanduser',
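
The `compat_etree_Element` alias added in the `-2508,6 +2508,15` hunk above exists so that callers can use it as the second argument of `isinstance()` on every supported interpreter. A minimal usage sketch, where `is_element` is a hypothetical helper and not part of `compat.py`:

```python
import xml.etree.ElementTree

try:
    # Probe from the hunk above: fails with TypeError where Element is a function.
    isinstance(None, xml.etree.ElementTree.Element)
    from xml.etree.ElementTree import Element as compat_etree_Element
except TypeError:  # Python <= 2.6
    from xml.etree.ElementTree import _ElementInterface as compat_etree_Element


def is_element(obj):
    """Hypothetical helper: True if obj is an already-parsed XML element."""
    return isinstance(obj, compat_etree_Element)


node = xml.etree.ElementTree.fromstring('<a><b/></a>')
print(is_element(node), is_element('<a><b/></a>'))
```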


@ -176,7 +176,9 @@ class FileDownloader(object):
return return
speed = float(byte_counter) / elapsed speed = float(byte_counter) / elapsed
if speed > rate_limit: if speed > rate_limit:
time.sleep(max((byte_counter // rate_limit) - elapsed, 0)) sleep_time = float(byte_counter) / rate_limit - elapsed
if sleep_time > 0:
time.sleep(sleep_time)
def temp_name(self, filename): def temp_name(self, filename):
"""Returns a temporary filename for the given filename.""" """Returns a temporary filename for the given filename."""
@ -330,15 +332,15 @@ class FileDownloader(object):
""" """
nooverwrites_and_exists = ( nooverwrites_and_exists = (
self.params.get('nooverwrites', False) and self.params.get('nooverwrites', False)
os.path.exists(encodeFilename(filename)) and os.path.exists(encodeFilename(filename))
) )
if not hasattr(filename, 'write'): if not hasattr(filename, 'write'):
continuedl_and_exists = ( continuedl_and_exists = (
self.params.get('continuedl', True) and self.params.get('continuedl', True)
os.path.isfile(encodeFilename(filename)) and and os.path.isfile(encodeFilename(filename))
not self.params.get('nopart', False) and not self.params.get('nopart', False)
) )
# Check file already present # Check file already present
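
The `-176,7 +176,9` hunk above replaces the integer division in the rate limiter with a float division. With `//`, any overshoot smaller than one full second of data at the limit is truncated to zero and no sleep happens. A minimal sketch of the difference, assuming a hypothetical `slow_down_delay` function that returns the delay instead of sleeping; the guard conditions on its first line are assumptions, since the hunk only shows the final `return` of the original guards:

```python
def slow_down_delay(byte_counter, elapsed, rate_limit):
    """Seconds to wait so the average rate falls back to rate_limit."""
    if rate_limit is None or byte_counter == 0 or elapsed <= 0:
        return 0.0
    speed = float(byte_counter) / elapsed
    if speed <= rate_limit:
        return 0.0
    # Time the transfer should have taken at the limit, minus time actually spent.
    sleep_time = float(byte_counter) / rate_limit - elapsed
    return max(sleep_time, 0.0)


# 1500 bytes in 1 second against a 1000 B/s limit.
old_delay = max((1500 // 1000) - 1.0, 0)   # integer division truncates 1.5 to 1
new_delay = slow_down_delay(1500, 1.0, 1000)
print(old_delay, new_delay)
```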


@ -53,7 +53,7 @@ class DashSegmentsFD(FragmentFD):
except compat_urllib_error.HTTPError as err: except compat_urllib_error.HTTPError as err:
# YouTube may often return 404 HTTP error for a fragment causing the # YouTube may often return 404 HTTP error for a fragment causing the
# whole download to fail. However if the same fragment is immediately # whole download to fail. However if the same fragment is immediately
# retried with the same request data this usually succeeds (1-2 attemps # retried with the same request data this usually succeeds (1-2 attempts
# is usually enough) thus allowing to download the whole file successfully. # is usually enough) thus allowing to download the whole file successfully.
# To be future-proof we will retry all fragments that fail with any # To be future-proof we will retry all fragments that fail with any
# HTTP error. # HTTP error.


@ -121,7 +121,11 @@ class CurlFD(ExternalFD):
cmd += self._valueless_option('--silent', 'noprogress') cmd += self._valueless_option('--silent', 'noprogress')
cmd += self._valueless_option('--verbose', 'verbose') cmd += self._valueless_option('--verbose', 'verbose')
cmd += self._option('--limit-rate', 'ratelimit') cmd += self._option('--limit-rate', 'ratelimit')
cmd += self._option('--retry', 'retries') retry = self._option('--retry', 'retries')
if len(retry) == 2:
if retry[1] in ('inf', 'infinite'):
retry[1] = '2147483647'
cmd += retry
cmd += self._option('--max-filesize', 'max_filesize') cmd += self._option('--max-filesize', 'max_filesize')
cmd += self._option('--interface', 'source_address') cmd += self._option('--interface', 'source_address')
cmd += self._option('--proxy', 'proxy') cmd += self._option('--proxy', 'proxy')
@ -160,6 +164,12 @@ class WgetFD(ExternalFD):
cmd = [self.exe, '-O', tmpfilename, '-nv', '--no-cookies'] cmd = [self.exe, '-O', tmpfilename, '-nv', '--no-cookies']
for key, val in info_dict['http_headers'].items(): for key, val in info_dict['http_headers'].items():
cmd += ['--header', '%s: %s' % (key, val)] cmd += ['--header', '%s: %s' % (key, val)]
cmd += self._option('--limit-rate', 'ratelimit')
retry = self._option('--tries', 'retries')
if len(retry) == 2:
if retry[1] in ('inf', 'infinite'):
retry[1] = '0'
cmd += retry
cmd += self._option('--bind-address', 'source_address') cmd += self._option('--bind-address', 'source_address')
cmd += self._option('--proxy', 'proxy') cmd += self._option('--proxy', 'proxy')
cmd += self._valueless_option('--no-check-certificate', 'nocheckcertificate') cmd += self._valueless_option('--no-check-certificate', 'nocheckcertificate')
@ -184,6 +194,7 @@ class Aria2cFD(ExternalFD):
cmd += self._option('--interface', 'source_address') cmd += self._option('--interface', 'source_address')
cmd += self._option('--all-proxy', 'proxy') cmd += self._option('--all-proxy', 'proxy')
cmd += self._bool_option('--check-certificate', 'nocheckcertificate', 'false', 'true', '=') cmd += self._bool_option('--check-certificate', 'nocheckcertificate', 'false', 'true', '=')
cmd += self._bool_option('--remote-time', 'updatetime', 'true', 'false', '=')
cmd += ['--', info_dict['url']] cmd += ['--', info_dict['url']]
return cmd return cmd
@ -229,7 +240,7 @@ class FFmpegFD(ExternalFD):
# setting -seekable prevents ffmpeg from guessing if the server # setting -seekable prevents ffmpeg from guessing if the server
# supports seeking(by adding the header `Range: bytes=0-`), which # supports seeking(by adding the header `Range: bytes=0-`), which
# can cause problems in some cases # can cause problems in some cases
# https://github.com/rg3/youtube-dl/issues/11800#issuecomment-275037127 # https://github.com/ytdl-org/youtube-dl/issues/11800#issuecomment-275037127
# http://trac.ffmpeg.org/ticket/6125#comment:10 # http://trac.ffmpeg.org/ticket/6125#comment:10
args += ['-seekable', '1' if seekable else '0'] args += ['-seekable', '1' if seekable else '0']
@ -279,6 +290,7 @@ class FFmpegFD(ExternalFD):
tc_url = info_dict.get('tc_url') tc_url = info_dict.get('tc_url')
flash_version = info_dict.get('flash_version') flash_version = info_dict.get('flash_version')
live = info_dict.get('rtmp_live', False) live = info_dict.get('rtmp_live', False)
conn = info_dict.get('rtmp_conn')
if player_url is not None: if player_url is not None:
args += ['-rtmp_swfverify', player_url] args += ['-rtmp_swfverify', player_url]
if page_url is not None: if page_url is not None:
@ -293,6 +305,11 @@ class FFmpegFD(ExternalFD):
args += ['-rtmp_flashver', flash_version] args += ['-rtmp_flashver', flash_version]
if live: if live:
args += ['-rtmp_live', 'live'] args += ['-rtmp_live', 'live']
if isinstance(conn, list):
for entry in conn:
args += ['-rtmp_conn', entry]
elif isinstance(conn, compat_str):
args += ['-rtmp_conn', conn]
args += ['-i', url, '-c', 'copy'] args += ['-i', url, '-c', 'copy']
@ -324,7 +341,7 @@ class FFmpegFD(ExternalFD):
# mp4 file couldn't be played, but if we ask ffmpeg to quit it # mp4 file couldn't be played, but if we ask ffmpeg to quit it
# produces a file that is playable (this is mostly useful for live # produces a file that is playable (this is mostly useful for live
# streams). Note that Windows is not affected and produces playable # streams). Note that Windows is not affected and produces playable
# files (see https://github.com/rg3/youtube-dl/issues/8300). # files (see https://github.com/ytdl-org/youtube-dl/issues/8300).
if sys.platform != 'win32': if sys.platform != 'win32':
proc.communicate(b'q') proc.communicate(b'q')
raise raise
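
The `CurlFD` and `WgetFD` hunks above translate the retries values `inf` and `infinite` into whatever each external tool accepts: `2147483647` for `curl --retry` and `0` for `wget --tries`. A minimal sketch of that mapping, assuming a hypothetical `external_retry_args` helper in place of the `_option` method used by the real classes:

```python
def external_retry_args(flag, retries, infinite_value):
    """Hypothetical helper: build [flag, value] for an external downloader.

    `infinite_value` is the tool-specific spelling of "retry forever".
    """
    if retries is None:
        return []
    value = str(retries)
    if value in ('inf', 'infinite'):
        value = infinite_value
    return [flag, value]


curl_args = external_retry_args('--retry', 'infinite', '2147483647')
wget_args = external_retry_args('--tries', 'inf', '0')
print(curl_args, wget_args)
```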


@ -238,8 +238,8 @@ def write_metadata_tag(stream, metadata):
def remove_encrypted_media(media): def remove_encrypted_media(media):
return list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib and return list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib
'drmAdditionalHeaderSetId' not in e.attrib, and 'drmAdditionalHeaderSetId' not in e.attrib,
media)) media))
@ -267,8 +267,8 @@ class F4mFD(FragmentFD):
media = doc.findall(_add_ns('media')) media = doc.findall(_add_ns('media'))
if not media: if not media:
self.report_error('No media found') self.report_error('No media found')
for e in (doc.findall(_add_ns('drmAdditionalHeader')) + for e in (doc.findall(_add_ns('drmAdditionalHeader'))
doc.findall(_add_ns('drmAdditionalHeaderSet'))): + doc.findall(_add_ns('drmAdditionalHeaderSet'))):
# If id attribute is missing it's valid for all media nodes # If id attribute is missing it's valid for all media nodes
# without drmAdditionalHeaderId or drmAdditionalHeaderSetId attribute # without drmAdditionalHeaderId or drmAdditionalHeaderSetId attribute
if 'id' not in e.attrib: if 'id' not in e.attrib:
@ -324,8 +324,8 @@ class F4mFD(FragmentFD):
urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url)) urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url))
man_url = urlh.geturl() man_url = urlh.geturl()
# Some manifests may be malformed, e.g. prosiebensat1 generated manifests # Some manifests may be malformed, e.g. prosiebensat1 generated manifests
# (see https://github.com/rg3/youtube-dl/issues/6215#issuecomment-121704244 # (see https://github.com/ytdl-org/youtube-dl/issues/6215#issuecomment-121704244
# and https://github.com/rg3/youtube-dl/issues/7823) # and https://github.com/ytdl-org/youtube-dl/issues/7823)
manifest = fix_xml_ampersands(urlh.read().decode('utf-8', 'ignore')).strip() manifest = fix_xml_ampersands(urlh.read().decode('utf-8', 'ignore')).strip()
doc = compat_etree_fromstring(manifest) doc = compat_etree_fromstring(manifest)
@ -409,7 +409,7 @@ class F4mFD(FragmentFD):
# In tests, segments may be truncated, and thus # In tests, segments may be truncated, and thus
# FlvReader may not be able to parse the whole # FlvReader may not be able to parse the whole
# chunk. If so, write the segment as is # chunk. If so, write the segment as is
# See https://github.com/rg3/youtube-dl/issues/9214 # See https://github.com/ytdl-org/youtube-dl/issues/9214
dest_stream.write(down_data) dest_stream.write(down_data)
break break
raise raise


@ -190,12 +190,13 @@ class FragmentFD(FileDownloader):
}) })
def _start_frag_download(self, ctx): def _start_frag_download(self, ctx):
resume_len = ctx['complete_frags_downloaded_bytes']
total_frags = ctx['total_frags'] total_frags = ctx['total_frags']
# This dict stores the download progress, it's updated by the progress # This dict stores the download progress, it's updated by the progress
# hook # hook
state = { state = {
'status': 'downloading', 'status': 'downloading',
'downloaded_bytes': ctx['complete_frags_downloaded_bytes'], 'downloaded_bytes': resume_len,
'fragment_index': ctx['fragment_index'], 'fragment_index': ctx['fragment_index'],
'fragment_count': total_frags, 'fragment_count': total_frags,
'filename': ctx['filename'], 'filename': ctx['filename'],
@ -219,8 +220,8 @@ class FragmentFD(FileDownloader):
frag_total_bytes = s.get('total_bytes') or 0 frag_total_bytes = s.get('total_bytes') or 0
if not ctx['live']: if not ctx['live']:
estimated_size = ( estimated_size = (
(ctx['complete_frags_downloaded_bytes'] + frag_total_bytes) / (ctx['complete_frags_downloaded_bytes'] + frag_total_bytes)
(state['fragment_index'] + 1) * total_frags) / (state['fragment_index'] + 1) * total_frags)
state['total_bytes_estimate'] = estimated_size state['total_bytes_estimate'] = estimated_size
if s['status'] == 'finished': if s['status'] == 'finished':
@ -234,8 +235,8 @@ class FragmentFD(FileDownloader):
state['downloaded_bytes'] += frag_downloaded_bytes - ctx['prev_frag_downloaded_bytes'] state['downloaded_bytes'] += frag_downloaded_bytes - ctx['prev_frag_downloaded_bytes']
if not ctx['live']: if not ctx['live']:
state['eta'] = self.calc_eta( state['eta'] = self.calc_eta(
start, time_now, estimated_size, start, time_now, estimated_size - resume_len,
state['downloaded_bytes']) state['downloaded_bytes'] - resume_len)
state['speed'] = s.get('speed') or ctx.get('speed') state['speed'] = s.get('speed') or ctx.get('speed')
ctx['speed'] = state['speed'] ctx['speed'] = state['speed']
ctx['prev_frag_downloaded_bytes'] = frag_downloaded_bytes ctx['prev_frag_downloaded_bytes'] = frag_downloaded_bytes
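The ETA change in this hunk subtracts `resume_len` from both the estimated total and the downloaded byte count, so the transfer rate reflects only bytes fetched in the current session. A minimal sketch of that arithmetic follows. The `calc_eta` helper here is a simplified stand-in written for illustration, not youtube-dl's actual `FileDownloader.calc_eta`, and `eta_with_resume` is a hypothetical name.

```python
def calc_eta(start, now, total, current):
    """Estimate remaining seconds from elapsed time and progress.

    Illustrative stand-in; the real helper may differ in its edge cases.
    """
    elapsed = now - start
    if total is None or current <= 0 or elapsed <= 0:
        return None
    rate = current / elapsed  # bytes per second observed so far
    return int((total - current) / rate)


def eta_with_resume(start, now, estimated_size, downloaded_bytes, resume_len):
    # Hypothetical wrapper: ignore bytes that were already on disk before
    # this session started, mirroring the new side of the hunk.
    return calc_eta(
        start, now,
        estimated_size - resume_len,
        downloaded_bytes - resume_len)


# 500 bytes were on disk already; 100 more arrived during 10 seconds.
naive_eta = calc_eta(0, 10, 1000, 600)
resumed_eta = eta_with_resume(0, 10, 1000, 600, 500)
```

Without the offset, the bytes resumed from disk inflate the apparent rate and make the estimate far too optimistic.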

View File

@ -64,7 +64,7 @@ class HlsFD(FragmentFD):
s = urlh.read().decode('utf-8', 'ignore') s = urlh.read().decode('utf-8', 'ignore')
if not self.can_download(s, info_dict): if not self.can_download(s, info_dict):
if info_dict.get('extra_param_to_segment_url'): if info_dict.get('extra_param_to_segment_url') or info_dict.get('_decryption_key_url'):
self.report_error('pycrypto not found. Please install it.') self.report_error('pycrypto not found. Please install it.')
return False return False
self.report_warning( self.report_warning(
@ -76,12 +76,12 @@ class HlsFD(FragmentFD):
return fd.real_download(filename, info_dict) return fd.real_download(filename, info_dict)
def is_ad_fragment_start(s): def is_ad_fragment_start(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s or return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s
s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad')) or s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad'))
def is_ad_fragment_end(s): def is_ad_fragment_end(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=master' in s or return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=master' in s
s.startswith('#UPLYNK-SEGMENT') and s.endswith(',segment')) or s.startswith('#UPLYNK-SEGMENT') and s.endswith(',segment'))
media_frags = 0 media_frags = 0
ad_frags = 0 ad_frags = 0
@ -152,8 +152,8 @@ class HlsFD(FragmentFD):
except compat_urllib_error.HTTPError as err: except compat_urllib_error.HTTPError as err:
# Unavailable (possibly temporary) fragments may be served. # Unavailable (possibly temporary) fragments may be served.
# First we try to retry then either skip or abort. # First we try to retry then either skip or abort.
# See https://github.com/rg3/youtube-dl/issues/10165, # See https://github.com/ytdl-org/youtube-dl/issues/10165,
# https://github.com/rg3/youtube-dl/issues/10448). # https://github.com/ytdl-org/youtube-dl/issues/10448).
count += 1 count += 1
if count <= fragment_retries: if count <= fragment_retries:
self.report_retry_fragment(err, frag_index, count, fragment_retries) self.report_retry_fragment(err, frag_index, count, fragment_retries)
@ -169,7 +169,7 @@ class HlsFD(FragmentFD):
if decrypt_info['METHOD'] == 'AES-128': if decrypt_info['METHOD'] == 'AES-128':
iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence) iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence)
decrypt_info['KEY'] = decrypt_info.get('KEY') or self.ydl.urlopen( decrypt_info['KEY'] = decrypt_info.get('KEY') or self.ydl.urlopen(
self._prepare_url(info_dict, decrypt_info['URI'])).read() self._prepare_url(info_dict, info_dict.get('_decryption_key_url') or decrypt_info['URI'])).read()
frag_content = AES.new( frag_content = AES.new(
decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content) decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content)
self._append_fragment(ctx, frag_content) self._append_fragment(ctx, frag_content)
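When a playlist's key tag carries no `IV`, the code above derives one from the media sequence number with the struct format `'>8xq'`. A small sketch of what that format produces, using the standard `struct` module directly instead of `compat_struct_pack`; the function name `default_hls_iv` is an assumption made for this example.

```python
import struct


def default_hls_iv(media_sequence):
    # '>8xq': big-endian, 8 zero pad bytes, then a signed 64-bit integer.
    # The result is a 16-byte block, the size AES-128-CBC expects for an IV.
    return struct.pack('>8xq', media_sequence)


iv = default_hls_iv(7)
```

The sequence number ends up in the low-order bytes of a zero-padded 128-bit big-endian value.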

View File

@ -46,8 +46,8 @@ class HttpFD(FileDownloader):
is_test = self.params.get('test', False) is_test = self.params.get('test', False)
chunk_size = self._TEST_FILE_SIZE if is_test else ( chunk_size = self._TEST_FILE_SIZE if is_test else (
info_dict.get('downloader_options', {}).get('http_chunk_size') or info_dict.get('downloader_options', {}).get('http_chunk_size')
self.params.get('http_chunk_size') or 0) or self.params.get('http_chunk_size') or 0)
ctx.open_mode = 'wb' ctx.open_mode = 'wb'
ctx.resume_len = 0 ctx.resume_len = 0
@ -111,7 +111,7 @@ class HttpFD(FileDownloader):
# to match the value of requested Range HTTP header. This is due to a webservers # to match the value of requested Range HTTP header. This is due to a webservers
# that don't support resuming and serve a whole file with no Content-Range # that don't support resuming and serve a whole file with no Content-Range
# set in response despite of requested Range (see # set in response despite of requested Range (see
# https://github.com/rg3/youtube-dl/issues/6057#issuecomment-126129799) # https://github.com/ytdl-org/youtube-dl/issues/6057#issuecomment-126129799)
if has_range: if has_range:
content_range = ctx.data.headers.get('Content-Range') content_range = ctx.data.headers.get('Content-Range')
if content_range: if content_range:
@ -123,11 +123,11 @@ class HttpFD(FileDownloader):
content_len = int_or_none(content_range_m.group(3)) content_len = int_or_none(content_range_m.group(3))
accept_content_len = ( accept_content_len = (
# Non-chunked download # Non-chunked download
not ctx.chunk_size or not ctx.chunk_size
# Chunked download and requested piece or # Chunked download and requested piece or
# its part is promised to be served # its part is promised to be served
content_range_end == range_end or or content_range_end == range_end
content_len < range_end) or content_len < range_end)
if accept_content_len: if accept_content_len:
ctx.data_len = content_len ctx.data_len = content_len
return return
@ -152,8 +152,8 @@ class HttpFD(FileDownloader):
raise raise
else: else:
# Examine the reported length # Examine the reported length
if (content_length is not None and if (content_length is not None
(ctx.resume_len - 100 < int(content_length) < ctx.resume_len + 100)): and (ctx.resume_len - 100 < int(content_length) < ctx.resume_len + 100)):
# The file had already been fully downloaded. # The file had already been fully downloaded.
# Explanation to the above condition: in issue #175 it was revealed that # Explanation to the above condition: in issue #175 it was revealed that
# YouTube sometimes adds or removes a few bytes from the end of the file, # YouTube sometimes adds or removes a few bytes from the end of the file,

View File

@ -146,7 +146,7 @@ def write_piff_header(stream, params):
sps, pps = codec_private_data.split(u32.pack(1))[1:] sps, pps = codec_private_data.split(u32.pack(1))[1:]
avcc_payload = u8.pack(1) # configuration version avcc_payload = u8.pack(1) # configuration version
avcc_payload += sps[1:4] # avc profile indication + profile compatibility + avc level indication avcc_payload += sps[1:4] # avc profile indication + profile compatibility + avc level indication
avcc_payload += u8.pack(0xfc | (params.get('nal_unit_length_field', 4) - 1)) # complete represenation (1) + reserved (11111) + length size minus one avcc_payload += u8.pack(0xfc | (params.get('nal_unit_length_field', 4) - 1)) # complete representation (1) + reserved (11111) + length size minus one
avcc_payload += u8.pack(1) # reserved (0) + number of sps (0000001) avcc_payload += u8.pack(1) # reserved (0) + number of sps (0000001)
avcc_payload += u16.pack(len(sps)) avcc_payload += u16.pack(len(sps))
avcc_payload += sps avcc_payload += sps

View File

@ -15,10 +15,13 @@ class AbcNewsVideoIE(AMPIE):
IE_NAME = 'abcnews:video' IE_NAME = 'abcnews:video'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?:// https?://
(?:
abcnews\.go\.com/ abcnews\.go\.com/
(?: (?:
[^/]+/video/(?P<display_id>[0-9a-z-]+)-| [^/]+/video/(?P<display_id>[0-9a-z-]+)-|
video/embed\?.*?\bid= video/embed\?.*?\bid=
)|
fivethirtyeight\.abcnews\.go\.com/video/embed/\d+/
) )
(?P<id>\d+) (?P<id>\d+)
''' '''

View File

@ -4,29 +4,30 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str
from ..utils import ( from ..utils import (
dict_get,
int_or_none, int_or_none,
parse_iso8601, try_get,
) )
class ABCOTVSIE(InfoExtractor): class ABCOTVSIE(InfoExtractor):
IE_NAME = 'abcotvs' IE_NAME = 'abcotvs'
IE_DESC = 'ABC Owned Television Stations' IE_DESC = 'ABC Owned Television Stations'
_VALID_URL = r'https?://(?:abc(?:7(?:news|ny|chicago)?|11|13|30)|6abc)\.com(?:/[^/]+/(?P<display_id>[^/]+))?/(?P<id>\d+)' _VALID_URL = r'https?://(?P<site>abc(?:7(?:news|ny|chicago)?|11|13|30)|6abc)\.com(?:(?:/[^/]+)*/(?P<display_id>[^/]+))?/(?P<id>\d+)'
_TESTS = [ _TESTS = [
{ {
'url': 'http://abc7news.com/entertainment/east-bay-museum-celebrates-vintage-synthesizers/472581/', 'url': 'http://abc7news.com/entertainment/east-bay-museum-celebrates-vintage-synthesizers/472581/',
'info_dict': { 'info_dict': {
'id': '472581', 'id': '472548',
'display_id': 'east-bay-museum-celebrates-vintage-synthesizers', 'display_id': 'east-bay-museum-celebrates-vintage-synthesizers',
'ext': 'mp4', 'ext': 'mp4',
'title': 'East Bay museum celebrates vintage synthesizers', 'title': 'East Bay museum celebrates synthesized music',
'description': 'md5:24ed2bd527096ec2a5c67b9d5a9005f3', 'description': 'md5:24ed2bd527096ec2a5c67b9d5a9005f3',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1421123075, 'timestamp': 1421118520,
'upload_date': '20150113', 'upload_date': '20150113',
'uploader': 'Jonathan Bloom',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@ -37,39 +38,63 @@ class ABCOTVSIE(InfoExtractor):
'url': 'http://abc7news.com/472581', 'url': 'http://abc7news.com/472581',
'only_matching': True, 'only_matching': True,
}, },
{
'url': 'https://6abc.com/man-75-killed-after-being-struck-by-vehicle-in-chester/5725182/',
'only_matching': True,
},
] ]
_SITE_MAP = {
'6abc': 'wpvi',
'abc11': 'wtvd',
'abc13': 'ktrk',
'abc30': 'kfsn',
'abc7': 'kabc',
'abc7chicago': 'wls',
'abc7news': 'kgo',
'abc7ny': 'wabc',
}
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) site, display_id, video_id = re.match(self._VALID_URL, url).groups()
video_id = mobj.group('id') display_id = display_id or video_id
display_id = mobj.group('display_id') or video_id station = self._SITE_MAP[site]
webpage = self._download_webpage(url, display_id) data = self._download_json(
'https://api.abcotvs.com/v2/content', display_id, query={
'id': video_id,
'key': 'otv.web.%s.story' % station,
'station': station,
})['data']
video = try_get(data, lambda x: x['featuredMedia']['video'], dict) or data
video_id = compat_str(dict_get(video, ('id', 'publishedKey'), video_id))
title = video.get('title') or video['linkText']
m3u8 = self._html_search_meta( formats = []
'contentURL', webpage, 'm3u8 url', fatal=True).split('?')[0] m3u8_url = video.get('m3u8')
if m3u8_url:
formats = self._extract_m3u8_formats(m3u8, display_id, 'mp4') formats = self._extract_m3u8_formats(
video['m3u8'].split('?')[0], display_id, 'mp4', m3u8_id='hls', fatal=False)
mp4_url = video.get('mp4')
if mp4_url:
formats.append({
'abr': 128,
'format_id': 'https',
'height': 360,
'url': mp4_url,
'width': 640,
})
self._sort_formats(formats) self._sort_formats(formats)
title = self._og_search_title(webpage).strip() image = video.get('image') or {}
description = self._og_search_description(webpage).strip()
thumbnail = self._og_search_thumbnail(webpage)
timestamp = parse_iso8601(self._search_regex(
r'<div class="meta">\s*<time class="timeago" datetime="([^"]+)">',
webpage, 'upload date', fatal=False))
uploader = self._search_regex(
r'rel="author">([^<]+)</a>',
webpage, 'uploader', default=None)
return { return {
'id': video_id, 'id': video_id,
'display_id': display_id, 'display_id': display_id,
'title': title, 'title': title,
'description': description, 'description': dict_get(video, ('description', 'caption'), try_get(video, lambda x: x['meta']['description'])),
'thumbnail': thumbnail, 'thumbnail': dict_get(image, ('source', 'dynamicSource')),
'timestamp': timestamp, 'timestamp': int_or_none(video.get('date')),
'uploader': uploader, 'duration': int_or_none(video.get('length')),
'formats': formats, 'formats': formats,
} }
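The rewritten extractor no longer scrapes the web page: it captures the site name from the URL, maps it to a station call sign, and builds an API query from that. The sketch below replays only the URL-to-query step, with the regex and site map copied from the new side of this diff. The function name `api_query` is hypothetical, and no network request is made.

```python
import re

VALID_URL = (
    r'https?://(?P<site>abc(?:7(?:news|ny|chicago)?|11|13|30)|6abc)\.com'
    r'(?:(?:/[^/]+)*/(?P<display_id>[^/]+))?/(?P<id>\d+)')

SITE_MAP = {
    '6abc': 'wpvi',
    'abc11': 'wtvd',
    'abc13': 'ktrk',
    'abc30': 'kfsn',
    'abc7': 'kabc',
    'abc7chicago': 'wls',
    'abc7news': 'kgo',
    'abc7ny': 'wabc',
}


def api_query(url):
    """Return (display_id, query dict) for a station URL (illustrative only)."""
    site, display_id, video_id = re.match(VALID_URL, url).groups()
    station = SITE_MAP[site]
    # Fall back to the numeric id when the URL has no slug.
    return display_id or video_id, {
        'id': video_id,
        'key': 'otv.web.%s.story' % station,
        'station': station,
    }


short = api_query('http://abc7news.com/472581')
slugged = api_query(
    'https://6abc.com/man-75-killed-after-being-struck-by-vehicle-in-chester/5725182/')
```

The `(?:/[^/]+)*` part of the new pattern lets URLs with any number of path segments before the slug match.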

View File

@ -7,6 +7,7 @@ import functools
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import compat_str
from ..utils import ( from ..utils import (
clean_html,
float_or_none, float_or_none,
int_or_none, int_or_none,
try_get, try_get,
@ -27,7 +28,7 @@ class ACastIE(InfoExtractor):
''' '''
_TESTS = [{ _TESTS = [{
'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna', 'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna',
'md5': 'a02393c74f3bdb1801c3ec2695577ce0', 'md5': '16d936099ec5ca2d5869e3a813ee8dc4',
'info_dict': { 'info_dict': {
'id': '2a92b283-1a75-4ad8-8396-499c641de0d9', 'id': '2a92b283-1a75-4ad8-8396-499c641de0d9',
'ext': 'mp3', 'ext': 'mp3',
@ -46,28 +47,37 @@ class ACastIE(InfoExtractor):
}, { }, {
'url': 'https://play.acast.com/s/rattegangspodden/s04e09-styckmordet-i-helenelund-del-22', 'url': 'https://play.acast.com/s/rattegangspodden/s04e09-styckmordet-i-helenelund-del-22',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://play.acast.com/s/sparpodcast/2a92b283-1a75-4ad8-8396-499c641de0d9',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
channel, display_id = re.match(self._VALID_URL, url).groups() channel, display_id = re.match(self._VALID_URL, url).groups()
s = self._download_json( s = self._download_json(
'https://play-api.acast.com/stitch/%s/%s' % (channel, display_id), 'https://feeder.acast.com/api/v1/shows/%s/episodes/%s' % (channel, display_id),
display_id)['result'] display_id)
media_url = s['url'] media_url = s['url']
if re.search(r'[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}', display_id):
episode_url = s.get('episodeUrl')
if episode_url:
display_id = episode_url
else:
channel, display_id = re.match(self._VALID_URL, s['link']).groups()
cast_data = self._download_json( cast_data = self._download_json(
'https://play-api.acast.com/splash/%s/%s' % (channel, display_id), 'https://play-api.acast.com/splash/%s/%s' % (channel, display_id),
display_id)['result'] display_id)['result']
e = cast_data['episode'] e = cast_data['episode']
title = e['name'] title = e.get('name') or s['title']
return { return {
'id': compat_str(e['id']), 'id': compat_str(e['id']),
'display_id': display_id, 'display_id': display_id,
'url': media_url, 'url': media_url,
'title': title, 'title': title,
'description': e.get('description') or e.get('summary'), 'description': e.get('summary') or clean_html(e.get('description') or s.get('description')),
'thumbnail': e.get('image'), 'thumbnail': e.get('image'),
'timestamp': unified_timestamp(e.get('publishingDate')), 'timestamp': unified_timestamp(e.get('publishingDate') or s.get('publishDate')),
'duration': float_or_none(s.get('duration') or e.get('duration')), 'duration': float_or_none(e.get('duration') or s.get('duration')),
'filesize': int_or_none(e.get('contentLength')), 'filesize': int_or_none(e.get('contentLength')),
'creator': try_get(cast_data, lambda x: x['show']['author'], compat_str), 'creator': try_get(cast_data, lambda x: x['show']['author'], compat_str),
'series': try_get(cast_data, lambda x: x['show']['name'], compat_str), 'series': try_get(cast_data, lambda x: x['show']['name'], compat_str),

View File

@ -1,95 +0,0 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_str,
compat_urllib_parse_urlencode,
compat_urllib_parse_urlparse,
)
from ..utils import (
ExtractorError,
qualities,
)
class AddAnimeIE(InfoExtractor):
_VALID_URL = r'https?://(?:\w+\.)?add-anime\.net/(?:watch_video\.php\?(?:.*?)v=|video/)(?P<id>[\w_]+)'
_TESTS = [{
'url': 'http://www.add-anime.net/watch_video.php?v=24MR3YO5SAS9',
'md5': '72954ea10bc979ab5e2eb288b21425a0',
'info_dict': {
'id': '24MR3YO5SAS9',
'ext': 'mp4',
'description': 'One Piece 606',
'title': 'One Piece 606',
},
'skip': 'Video is gone',
}, {
'url': 'http://add-anime.net/video/MDUGWYKNGBD8/One-Piece-687',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
try:
webpage = self._download_webpage(url, video_id)
except ExtractorError as ee:
if not isinstance(ee.cause, compat_HTTPError) or \
ee.cause.code != 503:
raise
redir_webpage = ee.cause.read().decode('utf-8')
action = self._search_regex(
r'<form id="challenge-form" action="([^"]+)"',
redir_webpage, 'Redirect form')
vc = self._search_regex(
r'<input type="hidden" name="jschl_vc" value="([^"]+)"/>',
redir_webpage, 'redirect vc value')
av = re.search(
r'a\.value = ([0-9]+)[+]([0-9]+)[*]([0-9]+);',
redir_webpage)
if av is None:
raise ExtractorError('Cannot find redirect math task')
av_res = int(av.group(1)) + int(av.group(2)) * int(av.group(3))
parsed_url = compat_urllib_parse_urlparse(url)
av_val = av_res + len(parsed_url.netloc)
confirm_url = (
parsed_url.scheme + '://' + parsed_url.netloc +
action + '?' +
compat_urllib_parse_urlencode({
'jschl_vc': vc, 'jschl_answer': compat_str(av_val)}))
self._download_webpage(
confirm_url, video_id,
note='Confirming after redirect')
webpage = self._download_webpage(url, video_id)
FORMATS = ('normal', 'hq')
quality = qualities(FORMATS)
formats = []
for format_id in FORMATS:
rex = r"var %s_video_file = '(.*?)';" % re.escape(format_id)
video_url = self._search_regex(rex, webpage, 'video file URLx',
fatal=False)
if not video_url:
continue
formats.append({
'format_id': format_id,
'url': video_url,
'quality': quality(format_id),
})
self._sort_formats(formats)
video_title = self._og_search_title(webpage)
video_description = self._og_search_description(webpage)
return {
'_type': 'video',
'id': video_id,
'formats': formats,
'title': video_title,
'description': video_description
}

View File

@ -21,7 +21,6 @@ from ..utils import (
intlist_to_bytes, intlist_to_bytes,
long_to_bytes, long_to_bytes,
pkcs1pad, pkcs1pad,
srt_subtitles_timecode,
strip_or_none, strip_or_none,
urljoin, urljoin,
) )
@ -42,6 +41,18 @@ class ADNIE(InfoExtractor):
} }
_BASE_URL = 'http://animedigitalnetwork.fr' _BASE_URL = 'http://animedigitalnetwork.fr'
_RSA_KEY = (0xc35ae1e4356b65a73b551493da94b8cb443491c0aa092a357a5aee57ffc14dda85326f42d716e539a34542a0d3f363adf16c5ec222d713d5997194030ee2e4f0d1fb328c01a81cf6868c090d50de8e169c6b13d1675b9eeed1cbc51e1fffca9b38af07f37abd790924cd3bee59d0257cfda4fe5f3f0534877e21ce5821447d1b, 65537) _RSA_KEY = (0xc35ae1e4356b65a73b551493da94b8cb443491c0aa092a357a5aee57ffc14dda85326f42d716e539a34542a0d3f363adf16c5ec222d713d5997194030ee2e4f0d1fb328c01a81cf6868c090d50de8e169c6b13d1675b9eeed1cbc51e1fffca9b38af07f37abd790924cd3bee59d0257cfda4fe5f3f0534877e21ce5821447d1b, 65537)
_POS_ALIGN_MAP = {
'start': 1,
'end': 3,
}
_LINE_ALIGN_MAP = {
'middle': 8,
'end': 4,
}
@staticmethod
def _ass_subtitles_timecode(seconds):
return '%01d:%02d:%02d.%02d' % (seconds / 3600, (seconds % 3600) / 60, seconds % 60, (seconds % 1) * 100)
def _get_subtitles(self, sub_path, video_id): def _get_subtitles(self, sub_path, video_id):
if not sub_path: if not sub_path:
@ -49,14 +60,20 @@ class ADNIE(InfoExtractor):
enc_subtitles = self._download_webpage( enc_subtitles = self._download_webpage(
urljoin(self._BASE_URL, sub_path), urljoin(self._BASE_URL, sub_path),
video_id, fatal=False) video_id, 'Downloading subtitles location', fatal=False) or '{}'
subtitle_location = (self._parse_json(enc_subtitles, video_id, fatal=False) or {}).get('location')
if subtitle_location:
enc_subtitles = self._download_webpage(
urljoin(self._BASE_URL, subtitle_location),
video_id, 'Downloading subtitles data', fatal=False,
headers={'Origin': 'https://animedigitalnetwork.fr'})
if not enc_subtitles: if not enc_subtitles:
return None return None
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js # http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt( dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
bytes_to_intlist(compat_b64decode(enc_subtitles[24:])), bytes_to_intlist(compat_b64decode(enc_subtitles[24:])),
bytes_to_intlist(binascii.unhexlify(self._K + '9032ad7083106400')), bytes_to_intlist(binascii.unhexlify(self._K + '4b8ef13ec1872730')),
bytes_to_intlist(compat_b64decode(enc_subtitles[:24])) bytes_to_intlist(compat_b64decode(enc_subtitles[:24]))
)) ))
subtitles_json = self._parse_json( subtitles_json = self._parse_json(
@ -67,23 +84,27 @@ class ADNIE(InfoExtractor):
subtitles = {} subtitles = {}
for sub_lang, sub in subtitles_json.items(): for sub_lang, sub in subtitles_json.items():
srt = '' ssa = '''[Script Info]
for num, current in enumerate(sub): ScriptType:V4.00
start, end, text = ( [V4 Styles]
Format: Name,Fontname,Fontsize,PrimaryColour,SecondaryColour,TertiaryColour,BackColour,Bold,Italic,BorderStyle,Outline,Shadow,Alignment,MarginL,MarginR,MarginV,AlphaLevel,Encoding
Style: Default,Arial,18,16777215,16777215,16777215,0,-1,0,1,1,0,2,20,20,20,0,0
[Events]
Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
for current in sub:
start, end, text, line_align, position_align = (
float_or_none(current.get('startTime')), float_or_none(current.get('startTime')),
float_or_none(current.get('endTime')), float_or_none(current.get('endTime')),
current.get('text')) current.get('text'), current.get('lineAlign'),
current.get('positionAlign'))
if start is None or end is None or text is None: if start is None or end is None or text is None:
continue continue
srt += os.linesep.join( alignment = self._POS_ALIGN_MAP.get(position_align, 2) + self._LINE_ALIGN_MAP.get(line_align, 0)
( ssa += os.linesep + 'Dialogue: Marked=0,%s,%s,Default,,0,0,0,,%s%s' % (
'%d' % num, self._ass_subtitles_timecode(start),
'%s --> %s' % ( self._ass_subtitles_timecode(end),
srt_subtitles_timecode(start), '{\\a%d}' % alignment if alignment != 2 else '',
srt_subtitles_timecode(end)), text.replace('\n', '\\N').replace('<i>', '{\\i1}').replace('</i>', '{\\i0}'))
text,
os.linesep,
))
if sub_lang == 'vostf': if sub_lang == 'vostf':
sub_lang = 'fr' sub_lang = 'fr'
@ -91,8 +112,8 @@ class ADNIE(InfoExtractor):
'ext': 'json', 'ext': 'json',
'data': json.dumps(sub), 'data': json.dumps(sub),
}, { }, {
'ext': 'srt', 'ext': 'ssa',
'data': srt, 'data': ssa,
}]) }])
return subtitles return subtitles
@ -100,7 +121,15 @@ class ADNIE(InfoExtractor):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
player_config = self._parse_json(self._search_regex( player_config = self._parse_json(self._search_regex(
r'playerConfig\s*=\s*({.+});', webpage, 'player config'), video_id) r'playerConfig\s*=\s*({.+});', webpage,
'player config', default='{}'), video_id, fatal=False)
if not player_config:
config_url = urljoin(self._BASE_URL, self._search_regex(
r'(?:id="player"|class="[^"]*adn-player-container[^"]*")[^>]+data-url="([^"]+)"',
webpage, 'config url'))
player_config = self._download_json(
config_url, video_id,
'Downloading player config JSON metadata')['player']
video_info = {} video_info = {}
video_info_str = self._search_regex( video_info_str = self._search_regex(
@ -129,12 +158,15 @@ class ADNIE(InfoExtractor):
encrypted_message = long_to_bytes(pow(bytes_to_long(padded_message), e, n)) encrypted_message = long_to_bytes(pow(bytes_to_long(padded_message), e, n))
authorization = base64.b64encode(encrypted_message).decode() authorization = base64.b64encode(encrypted_message).decode()
links_data = self._download_json( links_data = self._download_json(
urljoin(self._BASE_URL, links_url), video_id, headers={ urljoin(self._BASE_URL, links_url), video_id,
'Downloading links JSON metadata', headers={
'Authorization': 'Bearer ' + authorization, 'Authorization': 'Bearer ' + authorization,
}) })
links = links_data.get('links') or {} links = links_data.get('links') or {}
metas = metas or links_data.get('meta') or {} metas = metas or links_data.get('meta') or {}
sub_path = (sub_path or links_data.get('subtitles')) + '&token=' + token sub_path = sub_path or links_data.get('subtitles') or \
'index.php?option=com_vodapi&task=subtitles.getJSON&format=json&id=' + video_id
sub_path += '&token=' + token
error = links_data.get('error') error = links_data.get('error')
title = metas.get('title') or video_info['title'] title = metas.get('title') or video_info['title']
@ -142,9 +174,11 @@ class ADNIE(InfoExtractor):
for format_id, qualities in links.items(): for format_id, qualities in links.items():
if not isinstance(qualities, dict): if not isinstance(qualities, dict):
continue continue
for load_balancer_url in qualities.values(): for quality, load_balancer_url in qualities.items():
load_balancer_data = self._download_json( load_balancer_data = self._download_json(
load_balancer_url, video_id, fatal=False) or {} load_balancer_url, video_id,
'Downloading %s %s JSON metadata' % (format_id, quality),
fatal=False) or {}
m3u8_url = load_balancer_data.get('location') m3u8_url = load_balancer_data.get('location')
if not m3u8_url: if not m3u8_url:
continue continue
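The subtitle changes in this file switch the generated format from SRT to SSA, which needs an `H:MM:SS.cc` timecode and a numeric alignment code. A standalone sketch of those two pieces, assuming the same maps and the same truncating behaviour as the new code; the module-level function names are chosen for this example, and explicit `int()` calls replace the implicit float truncation of `%d`.

```python
POS_ALIGN_MAP = {'start': 1, 'end': 3}
LINE_ALIGN_MAP = {'middle': 8, 'end': 4}


def ass_subtitles_timecode(seconds):
    # Hours, minutes, seconds and centiseconds, truncated rather than rounded.
    return '%01d:%02d:%02d.%02d' % (
        int(seconds // 3600),
        int((seconds % 3600) // 60),
        int(seconds % 60),
        int((seconds % 1) * 100))


def ssa_alignment(position_align=None, line_align=None):
    # Unknown or missing values fall back to 2 + 0, the default that the
    # diff treats as "no override tag needed".
    return POS_ALIGN_MAP.get(position_align, 2) + LINE_ALIGN_MAP.get(line_align, 0)


timecode = ass_subtitles_timecode(3661.5)
```

Only cues whose alignment differs from the default of 2 get an `{\a…}` override tag in the emitted `Dialogue` line.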

View File

@ -0,0 +1,37 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_parse_qs,
compat_urlparse,
)
class AdobeConnectIE(InfoExtractor):
_VALID_URL = r'https?://\w+\.adobeconnect\.com/(?P<id>[\w-]+)'
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(r'<title>(.+?)</title>', webpage, 'title')
qs = compat_parse_qs(self._search_regex(r"swfUrl\s*=\s*'([^']+)'", webpage, 'swf url').split('?')[1])
is_live = qs.get('isLive', ['false'])[0] == 'true'
formats = []
for con_string in qs['conStrings'][0].split(','):
formats.append({
'format_id': con_string.split('://')[0],
'app': compat_urlparse.quote('?' + con_string.split('?')[1] + 'flvplayerapp/' + qs['appInstance'][0]),
'ext': 'flv',
'play_path': 'mp4:' + qs['streamName'][0],
'rtmp_conn': 'S:' + qs['ticket'][0],
'rtmp_live': is_live,
'url': con_string,
})
return {
'id': video_id,
'title': self._live_title(title) if is_live else title,
'formats': formats,
'is_live': is_live,
}

View File

@ -25,6 +25,11 @@ MSO_INFO = {
'username_field': 'username', 'username_field': 'username',
'password_field': 'password', 'password_field': 'password',
}, },
'ATT': {
'name': 'AT&T U-verse',
'username_field': 'userid',
'password_field': 'password',
},
'ATTOTT': { 'ATTOTT': {
'name': 'DIRECTV NOW', 'name': 'DIRECTV NOW',
'username_field': 'email', 'username_field': 'email',

View File

@ -1,25 +1,119 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import functools
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import compat_str
from ..utils import ( from ..utils import (
parse_duration,
unified_strdate,
str_to_int,
int_or_none,
float_or_none, float_or_none,
int_or_none,
ISO639Utils, ISO639Utils,
determine_ext, OnDemandPagedList,
parse_duration,
str_or_none,
str_to_int,
unified_strdate,
) )
class AdobeTVBaseIE(InfoExtractor): class AdobeTVBaseIE(InfoExtractor):
_API_BASE_URL = 'http://tv.adobe.com/api/v4/' def _call_api(self, path, video_id, query, note=None):
return self._download_json(
'http://tv.adobe.com/api/v4/' + path,
video_id, note, query=query)['data']
def _parse_subtitles(self, video_data, url_key):
subtitles = {}
for translation in video_data.get('translations', []):
vtt_path = translation.get(url_key)
if not vtt_path:
continue
lang = translation.get('language_w3c') or ISO639Utils.long2short(translation['language_medium'])
subtitles.setdefault(lang, []).append({
'ext': 'vtt',
'url': vtt_path,
})
return subtitles
def _parse_video_data(self, video_data):
video_id = compat_str(video_data['id'])
title = video_data['title']
s3_extracted = False
formats = []
for source in video_data.get('videos', []):
source_url = source.get('url')
if not source_url:
continue
f = {
'format_id': source.get('quality_level'),
'fps': int_or_none(source.get('frame_rate')),
'height': int_or_none(source.get('height')),
'tbr': int_or_none(source.get('video_data_rate')),
'width': int_or_none(source.get('width')),
'url': source_url,
}
original_filename = source.get('original_filename')
if original_filename:
if not (f.get('height') and f.get('width')):
mobj = re.search(r'_(\d+)x(\d+)', original_filename)
if mobj:
f.update({
'height': int(mobj.group(2)),
'width': int(mobj.group(1)),
})
if original_filename.startswith('s3://') and not s3_extracted:
formats.append({
'format_id': 'original',
'preference': 1,
'url': original_filename.replace('s3://', 'https://s3.amazonaws.com/'),
})
s3_extracted = True
formats.append(f)
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': video_data.get('description'),
'thumbnail': video_data.get('thumbnail'),
'upload_date': unified_strdate(video_data.get('start_date')),
'duration': parse_duration(video_data.get('duration')),
'view_count': str_to_int(video_data.get('playcount')),
'formats': formats,
'subtitles': self._parse_subtitles(video_data, 'vtt'),
}
class AdobeTVEmbedIE(AdobeTVBaseIE):
IE_NAME = 'adobetv:embed'
_VALID_URL = r'https?://tv\.adobe\.com/embed/\d+/(?P<id>\d+)'
_TEST = {
'url': 'https://tv.adobe.com/embed/22/4153',
'md5': 'c8c0461bf04d54574fc2b4d07ac6783a',
'info_dict': {
'id': '4153',
'ext': 'flv',
'title': 'Creating Graphics Optimized for BlackBerry',
'description': 'md5:eac6e8dced38bdaae51cd94447927459',
'thumbnail': r're:https?://.*\.jpg$',
'upload_date': '20091109',
'duration': 377,
'view_count': int,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._call_api(
'episode/' + video_id, video_id, {'disclosure': 'standard'})[0]
return self._parse_video_data(video_data)
class AdobeTVIE(AdobeTVBaseIE): class AdobeTVIE(AdobeTVBaseIE):
IE_NAME = 'adobetv'
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?watch/(?P<show_urlname>[^/]+)/(?P<id>[^/]+)' _VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?watch/(?P<show_urlname>[^/]+)/(?P<id>[^/]+)'
_TEST = { _TEST = {
@ -42,45 +136,33 @@ class AdobeTVIE(AdobeTVBaseIE):
if not language: if not language:
language = 'en' language = 'en'
video_data = self._download_json( video_data = self._call_api(
self._API_BASE_URL + 'episode/get/?language=%s&show_urlname=%s&urlname=%s&disclosure=standard' % (language, show_urlname, urlname), 'episode/get', urlname, {
urlname)['data'][0] 'disclosure': 'standard',
'language': language,
formats = [{ 'show_urlname': show_urlname,
'url': source['url'], 'urlname': urlname,
'format_id': source.get('quality_level') or source['url'].split('-')[-1].split('.')[0] or None, })[0]
'width': int_or_none(source.get('width')), return self._parse_video_data(video_data)
'height': int_or_none(source.get('height')),
'tbr': int_or_none(source.get('video_data_rate')),
} for source in video_data['videos']]
self._sort_formats(formats)
return {
'id': compat_str(video_data['id']),
'title': video_data['title'],
'description': video_data.get('description'),
'thumbnail': video_data.get('thumbnail'),
'upload_date': unified_strdate(video_data.get('start_date')),
'duration': parse_duration(video_data.get('duration')),
'view_count': str_to_int(video_data.get('playcount')),
'formats': formats,
}
class AdobeTVPlaylistBaseIE(AdobeTVBaseIE): class AdobeTVPlaylistBaseIE(AdobeTVBaseIE):
def _parse_page_data(self, page_data): _PAGE_SIZE = 25
return [self.url_result(self._get_element_url(element_data)) for element_data in page_data]
def _extract_playlist_entries(self, url, display_id): def _fetch_page(self, display_id, query, page):
page = self._download_json(url, display_id) page += 1
entries = self._parse_page_data(page['data']) query['page'] = page
for page_num in range(2, page['paging']['pages'] + 1): for element_data in self._call_api(
entries.extend(self._parse_page_data( self._RESOURCE, display_id, query, 'Download Page %d' % page):
self._download_json(url + '&page=%d' % page_num, display_id)['data'])) yield self._process_data(element_data)
return entries
def _extract_playlist_entries(self, display_id, query):
return OnDemandPagedList(functools.partial(
self._fetch_page, display_id, query), self._PAGE_SIZE)
class AdobeTVShowIE(AdobeTVPlaylistBaseIE): class AdobeTVShowIE(AdobeTVPlaylistBaseIE):
IE_NAME = 'adobetv:show'
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?show/(?P<id>[^/]+)' _VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?show/(?P<id>[^/]+)'
_TEST = { _TEST = {
@ -92,26 +174,31 @@ class AdobeTVShowIE(AdobeTVPlaylistBaseIE):
}, },
'playlist_mincount': 136, 'playlist_mincount': 136,
} }
_RESOURCE = 'episode'
def _get_element_url(self, element_data): _process_data = AdobeTVBaseIE._parse_video_data
return element_data['urls'][0]
def _real_extract(self, url): def _real_extract(self, url):
language, show_urlname = re.match(self._VALID_URL, url).groups() language, show_urlname = re.match(self._VALID_URL, url).groups()
if not language: if not language:
language = 'en' language = 'en'
query = 'language=%s&show_urlname=%s' % (language, show_urlname) query = {
'disclosure': 'standard',
'language': language,
'show_urlname': show_urlname,
}
show_data = self._download_json(self._API_BASE_URL + 'show/get/?%s' % query, show_urlname)['data'][0] show_data = self._call_api(
'show/get', show_urlname, query)[0]
return self.playlist_result( return self.playlist_result(
self._extract_playlist_entries(self._API_BASE_URL + 'episode/?%s' % query, show_urlname), self._extract_playlist_entries(show_urlname, query),
compat_str(show_data['id']), str_or_none(show_data.get('id')),
show_data['show_name'], show_data.get('show_name'),
show_data['show_description']) show_data.get('show_description'))
class AdobeTVChannelIE(AdobeTVPlaylistBaseIE): class AdobeTVChannelIE(AdobeTVPlaylistBaseIE):
IE_NAME = 'adobetv:channel'
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?channel/(?P<id>[^/]+)(?:/(?P<category_urlname>[^/]+))?' _VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?channel/(?P<id>[^/]+)(?:/(?P<category_urlname>[^/]+))?'
_TEST = { _TEST = {
@ -121,24 +208,30 @@ class AdobeTVChannelIE(AdobeTVPlaylistBaseIE):
}, },
'playlist_mincount': 96, 'playlist_mincount': 96,
} }
_RESOURCE = 'show'
def _get_element_url(self, element_data): def _process_data(self, show_data):
return element_data['url'] return self.url_result(
show_data['url'], 'AdobeTVShow', str_or_none(show_data.get('id')))
def _real_extract(self, url): def _real_extract(self, url):
language, channel_urlname, category_urlname = re.match(self._VALID_URL, url).groups() language, channel_urlname, category_urlname = re.match(self._VALID_URL, url).groups()
if not language: if not language:
language = 'en' language = 'en'
query = 'language=%s&channel_urlname=%s' % (language, channel_urlname) query = {
'channel_urlname': channel_urlname,
'language': language,
}
if category_urlname: if category_urlname:
query += '&category_urlname=%s' % category_urlname query['category_urlname'] = category_urlname
return self.playlist_result( return self.playlist_result(
self._extract_playlist_entries(self._API_BASE_URL + 'show/?%s' % query, channel_urlname), self._extract_playlist_entries(channel_urlname, query),
channel_urlname) channel_urlname)
class AdobeTVVideoIE(InfoExtractor): class AdobeTVVideoIE(AdobeTVBaseIE):
IE_NAME = 'adobetv:video'
_VALID_URL = r'https?://video\.tv\.adobe\.com/v/(?P<id>\d+)' _VALID_URL = r'https?://video\.tv\.adobe\.com/v/(?P<id>\d+)'
_TEST = { _TEST = {
@ -160,38 +253,36 @@ class AdobeTVVideoIE(InfoExtractor):
video_data = self._parse_json(self._search_regex( video_data = self._parse_json(self._search_regex(
r'var\s+bridge\s*=\s*([^;]+);', webpage, 'bridged data'), video_id) r'var\s+bridge\s*=\s*([^;]+);', webpage, 'bridged data'), video_id)
title = video_data['title']
formats = [{ formats = []
'format_id': '%s-%s' % (determine_ext(source['src']), source.get('height')), sources = video_data.get('sources') or []
'url': source['src'], for source in sources:
'width': int_or_none(source.get('width')), source_src = source.get('src')
'height': int_or_none(source.get('height')), if not source_src:
'tbr': int_or_none(source.get('bitrate')), continue
} for source in video_data['sources']] formats.append({
'filesize': int_or_none(source.get('kilobytes') or None, invscale=1000),
'format_id': '-'.join(filter(None, [source.get('format'), source.get('label')])),
'height': int_or_none(source.get('height') or None),
'tbr': int_or_none(source.get('bitrate') or None),
'width': int_or_none(source.get('width') or None),
'url': source_src,
})
self._sort_formats(formats) self._sort_formats(formats)
# For both metadata and downloaded files the duration varies among # For both metadata and downloaded files the duration varies among
# formats. I just pick the max one # formats. I just pick the max one
duration = max(filter(None, [ duration = max(filter(None, [
float_or_none(source.get('duration'), scale=1000) float_or_none(source.get('duration'), scale=1000)
for source in video_data['sources']])) for source in sources]))
subtitles = {}
for translation in video_data.get('translations', []):
lang_id = translation.get('language_w3c') or ISO639Utils.long2short(translation['language_medium'])
if lang_id not in subtitles:
subtitles[lang_id] = []
subtitles[lang_id].append({
'url': translation['vttPath'],
'ext': 'vtt',
})
return { return {
'id': video_id, 'id': video_id,
'formats': formats, 'formats': formats,
'title': video_data['title'], 'title': title,
'description': video_data.get('description'), 'description': video_data.get('description'),
'thumbnail': video_data['video'].get('poster'), 'thumbnail': video_data.get('video', {}).get('poster'),
'duration': duration, 'duration': duration,
'subtitles': subtitles, 'subtitles': self._parse_subtitles(video_data, 'vttPath'),
} }
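
Note on the paging change in `AdobeTVPlaylistBaseIE` above: the eager loop that concatenated every page is replaced by a page function bound with `functools.partial` and handed to a lazy pager. The following is a minimal sketch of that pattern only. `fake_api`, `ITEMS`, `calls` and `LazyPagedList` are hypothetical stand-ins invented for illustration; `LazyPagedList` is a simplified substitute and not youtube-dl's `OnDemandPagedList`.

```python
import functools

PAGE_SIZE = 25  # mirrors _PAGE_SIZE in the diff

# Hypothetical stand-in for the remote API: 60 items, served page by page.
ITEMS = ['episode-%d' % i for i in range(60)]
calls = []


def fake_api(query):
    # Record each request so the laziness is observable.
    calls.append(query['page'])
    start = (query['page'] - 1) * PAGE_SIZE
    return ITEMS[start:start + PAGE_SIZE]


def fetch_page(display_id, query, page):
    # The pager passes a 0-based index; the API expects 1-based pages.
    page += 1
    query = dict(query, page=page)  # copy instead of mutating the caller's dict
    for element in fake_api(query):
        yield '%s/%s' % (display_id, element)


class LazyPagedList(object):
    """Simplified illustration only; not youtube-dl's OnDemandPagedList."""

    def __init__(self, pagefunc, pagesize):
        self._pagefunc = pagefunc
        self._pagesize = pagesize

    def get_slice(self, start, end):
        results = []
        last_page = (end - 1) // self._pagesize
        for pagenum in range(start // self._pagesize, last_page + 1):
            page = list(self._pagefunc(pagenum))
            first = pagenum * self._pagesize
            results.extend(page[max(start - first, 0):end - first])
            if len(page) < self._pagesize:
                break  # a short page means there is nothing further to fetch
        return results


entries = LazyPagedList(
    functools.partial(fetch_page, 'my-show', {'language': 'en'}), PAGE_SIZE)
first_three = entries.get_slice(0, 3)
```

Only the pages that a requested slice touches are downloaded, which is the practical benefit over the removed loop when a user asks for a few playlist items.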


@ -1,13 +1,19 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import json
import re import re
from .turner import TurnerBaseIE from .turner import TurnerBaseIE
from ..utils import ( from ..utils import (
determine_ext,
float_or_none,
int_or_none, int_or_none,
mimetype2ext,
parse_age_limit,
parse_iso8601,
strip_or_none, strip_or_none,
url_or_none, try_get,
) )
@ -21,8 +27,8 @@ class AdultSwimIE(TurnerBaseIE):
'ext': 'mp4', 'ext': 'mp4',
'title': 'Rick and Morty - Pilot', 'title': 'Rick and Morty - Pilot',
'description': 'Rick moves in with his daughter\'s family and establishes himself as a bad influence on his grandson, Morty.', 'description': 'Rick moves in with his daughter\'s family and establishes himself as a bad influence on his grandson, Morty.',
'timestamp': 1493267400, 'timestamp': 1543294800,
'upload_date': '20170427', 'upload_date': '20181127',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@ -43,6 +49,7 @@ class AdultSwimIE(TurnerBaseIE):
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
'skip': '404 Not Found',
}, { }, {
'url': 'http://www.adultswim.com/videos/decker/inside-decker-a-new-hero/', 'url': 'http://www.adultswim.com/videos/decker/inside-decker-a-new-hero/',
'info_dict': { 'info_dict': {
@ -61,9 +68,9 @@ class AdultSwimIE(TurnerBaseIE):
}, { }, {
'url': 'http://www.adultswim.com/videos/attack-on-titan', 'url': 'http://www.adultswim.com/videos/attack-on-titan',
'info_dict': { 'info_dict': {
'id': 'b7A69dzfRzuaXIECdxW8XQ', 'id': 'attack-on-titan',
'title': 'Attack on Titan', 'title': 'Attack on Titan',
'description': 'md5:6c8e003ea0777b47013e894767f5e114', 'description': 'md5:41caa9416906d90711e31dc00cb7db7e',
}, },
'playlist_mincount': 12, 'playlist_mincount': 12,
}, { }, {
@ -78,83 +85,118 @@ class AdultSwimIE(TurnerBaseIE):
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
'skip': '404 Not Found',
}] }]
def _real_extract(self, url): def _real_extract(self, url):
show_path, episode_path = re.match(self._VALID_URL, url).groups() show_path, episode_path = re.match(self._VALID_URL, url).groups()
display_id = episode_path or show_path display_id = episode_path or show_path
webpage = self._download_webpage(url, display_id) query = '''query {
initial_data = self._parse_json(self._search_regex( getShowBySlug(slug:"%s") {
r'AS_INITIAL_DATA(?:__)?\s*=\s*({.+?});', %%s
webpage, 'initial data'), display_id) }
}''' % show_path
is_stream = show_path == 'streams' if episode_path:
if is_stream: query = query % '''title
if not episode_path: getVideoBySlug(slug:"%s") {
episode_path = 'live-stream' _id
auth
video_data = next(stream for stream_path, stream in initial_data['streams'].items() if stream_path == episode_path) description
video_id = video_data.get('stream') duration
episodeNumber
if not video_id: launchDate
entries = [] mediaID
for episode in video_data.get('archiveEpisodes', []): seasonNumber
episode_url = url_or_none(episode.get('url')) poster
if not episode_url: title
continue tvRating
entries.append(self.url_result( }''' % episode_path
episode_url, 'AdultSwim', episode.get('id'))) ['getVideoBySlug']
return self.playlist_result(
entries, video_data.get('id'), video_data.get('title'),
strip_or_none(video_data.get('description')))
else: else:
show_data = initial_data['show'] query = query % '''metaDescription
title
videos(first:1000,sort:["episode_number"]) {
edges {
node {
_id
slug
}
}
}'''
show_data = self._download_json(
'https://www.adultswim.com/api/search', display_id,
data=json.dumps({'query': query}).encode(),
headers={'Content-Type': 'application/json'})['data']['getShowBySlug']
if episode_path:
video_data = show_data['getVideoBySlug']
video_id = video_data['_id']
episode_title = title = video_data['title']
series = show_data.get('title')
if series:
title = '%s - %s' % (series, title)
info = {
'id': video_id,
'title': title,
'description': strip_or_none(video_data.get('description')),
'duration': float_or_none(video_data.get('duration')),
'formats': [],
'subtitles': {},
'age_limit': parse_age_limit(video_data.get('tvRating')),
'thumbnail': video_data.get('poster'),
'timestamp': parse_iso8601(video_data.get('launchDate')),
'series': series,
'season_number': int_or_none(video_data.get('seasonNumber')),
'episode': episode_title,
'episode_number': int_or_none(video_data.get('episodeNumber')),
}
if not episode_path: auth = video_data.get('auth')
media_id = video_data.get('mediaID')
if media_id:
info.update(self._extract_ngtv_info(media_id, {
# CDN_TOKEN_APP_ID from:
# https://d2gg02c3xr550i.cloudfront.net/assets/asvp.e9c8bef24322d060ef87.bundle.js
'appId': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhcHBJZCI6ImFzLXR2ZS1kZXNrdG9wLXB0enQ2bSIsInByb2R1Y3QiOiJ0dmUiLCJuZXR3b3JrIjoiYXMiLCJwbGF0Zm9ybSI6ImRlc2t0b3AiLCJpYXQiOjE1MzI3MDIyNzl9.BzSCk-WYOZ2GMCIaeVb8zWnzhlgnXuJTCu0jGp_VaZE',
}, {
'url': url,
'site_name': 'AdultSwim',
'auth_required': auth,
}))
if not auth:
extract_data = self._download_json(
'https://www.adultswim.com/api/shows/v1/videos/' + video_id,
video_id, query={'fields': 'stream'}, fatal=False) or {}
assets = try_get(extract_data, lambda x: x['data']['video']['stream']['assets'], list) or []
for asset in assets:
asset_url = asset.get('url')
if not asset_url:
continue
ext = determine_ext(asset_url, mimetype2ext(asset.get('mime_type')))
if ext == 'm3u8':
info['formats'].extend(self._extract_m3u8_formats(
asset_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
elif ext == 'f4m':
continue
# info['formats'].extend(self._extract_f4m_formats(
# asset_url, video_id, f4m_id='hds', fatal=False))
elif ext in ('scc', 'ttml', 'vtt'):
info['subtitles'].setdefault('en', []).append({
'url': asset_url,
})
self._sort_formats(info['formats'])
return info
else:
entries = [] entries = []
for video in show_data.get('videos', []): for edge in show_data.get('videos', {}).get('edges', []):
video = edge.get('node') or {}
slug = video.get('slug') slug = video.get('slug')
if not slug: if not slug:
continue continue
entries.append(self.url_result( entries.append(self.url_result(
'http://adultswim.com/videos/%s/%s' % (show_path, slug), 'http://adultswim.com/videos/%s/%s' % (show_path, slug),
'AdultSwim', video.get('id'))) 'AdultSwim', video.get('_id')))
return self.playlist_result( return self.playlist_result(
entries, show_data.get('id'), show_data.get('title'), entries, show_path, show_data.get('title'),
strip_or_none(show_data.get('metadata', {}).get('description'))) strip_or_none(show_data.get('metaDescription')))
video_data = show_data['sluggedVideo']
video_id = video_data['id']
info = self._extract_cvp_info(
'http://www.adultswim.com/videos/api/v0/assets?platform=desktop&id=' + video_id,
video_id, {
'secure': {
'media_src': 'http://androidhls-secure.cdn.turner.com/adultswim/big',
'tokenizer_src': 'http://www.adultswim.com/astv/mvpd/processors/services/token_ipadAdobe.do',
},
}, {
'url': url,
'site_name': 'AdultSwim',
'auth_required': video_data.get('auth'),
})
info.update({
'id': video_id,
'display_id': display_id,
'description': info.get('description') or strip_or_none(video_data.get('description')),
})
if not is_stream:
info.update({
'duration': info.get('duration') or int_or_none(video_data.get('duration')),
'timestamp': info.get('timestamp') or int_or_none(video_data.get('launch_date')),
'season_number': info.get('season_number') or int_or_none(video_data.get('season_number')),
'episode': info['title'],
'episode_number': info.get('episode_number') or int_or_none(video_data.get('episode_number')),
})
info['series'] = video_data.get('collection_title') or info.get('series')
if info['series'] and info['series'] != info['title']:
info['title'] = '%s - %s' % (info['series'], info['title'])
return info
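
Note on the query construction in the new `AdultSwimIE._real_extract` above: the GraphQL text is built in two `%` passes, with `%%s` surviving the first pass as a literal `%s` that the second pass fills with the field selection. A minimal sketch of that technique follows, with a shortened field list. `build_query` and the slugs `rick-and-morty` and `pilot` are illustrative assumptions, not values taken from the commit.

```python
import json


def build_query(show_path, episode_path=None):
    # First pass fills the show slug; '%%s' is reduced to a literal '%s'.
    query = '''query {
  getShowBySlug(slug:"%s") {
    %%s
  }
}''' % show_path
    if episode_path:
        fields = '''title
    getVideoBySlug(slug:"%s") {
      _id
      title
    }''' % episode_path
    else:
        fields = '''metaDescription
    title'''
    # Second pass fills the selection set left open by the first pass.
    return query % fields


episode_query = build_query('rick-and-morty', 'pilot')
show_query = build_query('rick-and-morty')
# The request body is the JSON-encoded query, sent as bytes.
payload = json.dumps({'query': episode_query}).encode()
```

One caveat of this approach: a show slug containing a literal `%` would be misread by the second formatting pass.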


@ -1,14 +1,15 @@
# coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re import re
from .theplatform import ThePlatformIE from .theplatform import ThePlatformIE
from ..utils import ( from ..utils import (
extract_attributes,
ExtractorError,
int_or_none,
smuggle_url, smuggle_url,
update_url_query, update_url_query,
unescapeHTML,
extract_attributes,
get_element_by_attribute,
) )
from ..compat import ( from ..compat import (
compat_urlparse, compat_urlparse,
@ -19,6 +20,43 @@ class AENetworksBaseIE(ThePlatformIE):
_THEPLATFORM_KEY = 'crazyjava' _THEPLATFORM_KEY = 'crazyjava'
_THEPLATFORM_SECRET = 's3cr3t' _THEPLATFORM_SECRET = 's3cr3t'
def _extract_aen_smil(self, smil_url, video_id, auth=None):
query = {'mbr': 'true'}
if auth:
query['auth'] = auth
TP_SMIL_QUERY = [{
'assetTypes': 'high_video_ak',
'switch': 'hls_high_ak'
}, {
'assetTypes': 'high_video_s3'
}, {
'assetTypes': 'high_video_s3',
'switch': 'hls_ingest_fastly'
}]
formats = []
subtitles = {}
last_e = None
for q in TP_SMIL_QUERY:
q.update(query)
m_url = update_url_query(smil_url, q)
m_url = self._sign_url(m_url, self._THEPLATFORM_KEY, self._THEPLATFORM_SECRET)
try:
tp_formats, tp_subtitles = self._extract_theplatform_smil(
m_url, video_id, 'Downloading %s SMIL data' % (q.get('switch') or q['assetTypes']))
except ExtractorError as e:
last_e = e
continue
formats.extend(tp_formats)
subtitles = self._merge_subtitles(subtitles, tp_subtitles)
if last_e and not formats:
raise last_e
self._sort_formats(formats)
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
}
class AENetworksIE(AENetworksBaseIE): class AENetworksIE(AENetworksBaseIE):
IE_NAME = 'aenetworks' IE_NAME = 'aenetworks'
@ -33,22 +71,25 @@ class AENetworksIE(AENetworksBaseIE):
(?: (?:
shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})| shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|
movies/(?P<movie_display_id>[^/]+)(?:/full-movie)?| movies/(?P<movie_display_id>[^/]+)(?:/full-movie)?|
specials/(?P<special_display_id>[^/]+)/full-special| specials/(?P<special_display_id>[^/]+)/(?:full-special|preview-)|
collections/[^/]+/(?P<collection_display_id>[^/]+) collections/[^/]+/(?P<collection_display_id>[^/]+)
) )
''' '''
_TESTS = [{ _TESTS = [{
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1', 'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
'md5': 'a97a65f7e823ae10e9244bc5433d5fe6',
'info_dict': { 'info_dict': {
'id': '22253814', 'id': '22253814',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Winter Is Coming', 'title': 'Winter is Coming',
'description': 'md5:641f424b7a19d8e24f26dea22cf59d74', 'description': 'md5:641f424b7a19d8e24f26dea22cf59d74',
'timestamp': 1338306241, 'timestamp': 1338306241,
'upload_date': '20120529', 'upload_date': '20120529',
'uploader': 'AENE-NEW', 'uploader': 'AENE-NEW',
}, },
'params': {
# m3u8 download
'skip_download': True,
},
'add_ie': ['ThePlatform'], 'add_ie': ['ThePlatform'],
}, { }, {
'url': 'http://www.history.com/shows/ancient-aliens/season-1', 'url': 'http://www.history.com/shows/ancient-aliens/season-1',
@ -84,6 +125,9 @@ class AENetworksIE(AENetworksBaseIE):
}, { }, {
'url': 'https://www.historyvault.com/collections/america-the-story-of-us/westward', 'url': 'https://www.historyvault.com/collections/america-the-story-of-us/westward',
'only_matching': True 'only_matching': True
}, {
'url': 'https://www.aetv.com/specials/hunting-jonbenets-killer-the-untold-story/preview-hunting-jonbenets-killer-the-untold-story',
'only_matching': True
}] }]
_DOMAIN_TO_REQUESTOR_ID = { _DOMAIN_TO_REQUESTOR_ID = {
'history.com': 'HISTORY', 'history.com': 'HISTORY',
@ -124,11 +168,6 @@ class AENetworksIE(AENetworksBaseIE):
return self.playlist_result( return self.playlist_result(
entries, self._html_search_meta('aetn:SeasonId', webpage)) entries, self._html_search_meta('aetn:SeasonId', webpage))
query = {
'mbr': 'true',
'assetTypes': 'high_video_ak',
'switch': 'hls_high_ak',
}
video_id = self._html_search_meta('aetn:VideoID', webpage) video_id = self._html_search_meta('aetn:VideoID', webpage)
media_url = self._search_regex( media_url = self._search_regex(
[r"media_url\s*=\s*'(?P<url>[^']+)'", [r"media_url\s*=\s*'(?P<url>[^']+)'",
@ -138,64 +177,39 @@ class AENetworksIE(AENetworksBaseIE):
theplatform_metadata = self._download_theplatform_metadata(self._search_regex( theplatform_metadata = self._download_theplatform_metadata(self._search_regex(
r'https?://link\.theplatform\.com/s/([^?]+)', media_url, 'theplatform_path'), video_id) r'https?://link\.theplatform\.com/s/([^?]+)', media_url, 'theplatform_path'), video_id)
info = self._parse_theplatform_metadata(theplatform_metadata) info = self._parse_theplatform_metadata(theplatform_metadata)
auth = None
if theplatform_metadata.get('AETN$isBehindWall'): if theplatform_metadata.get('AETN$isBehindWall'):
requestor_id = self._DOMAIN_TO_REQUESTOR_ID[domain] requestor_id = self._DOMAIN_TO_REQUESTOR_ID[domain]
resource = self._get_mvpd_resource( resource = self._get_mvpd_resource(
requestor_id, theplatform_metadata['title'], requestor_id, theplatform_metadata['title'],
theplatform_metadata.get('AETN$PPL_pplProgramId') or theplatform_metadata.get('AETN$PPL_pplProgramId_OLD'), theplatform_metadata.get('AETN$PPL_pplProgramId') or theplatform_metadata.get('AETN$PPL_pplProgramId_OLD'),
theplatform_metadata['ratings'][0]['rating']) theplatform_metadata['ratings'][0]['rating'])
query['auth'] = self._extract_mvpd_auth( auth = self._extract_mvpd_auth(
url, video_id, requestor_id, resource) url, video_id, requestor_id, resource)
info.update(self._search_json_ld(webpage, video_id, fatal=False)) info.update(self._search_json_ld(webpage, video_id, fatal=False))
media_url = update_url_query(media_url, query) info.update(self._extract_aen_smil(media_url, video_id, auth))
media_url = self._sign_url(media_url, self._THEPLATFORM_KEY, self._THEPLATFORM_SECRET)
formats, subtitles = self._extract_theplatform_smil(media_url, video_id)
self._sort_formats(formats)
info.update({
'id': video_id,
'formats': formats,
'subtitles': subtitles,
})
return info return info
class HistoryTopicIE(AENetworksBaseIE): class HistoryTopicIE(AENetworksBaseIE):
IE_NAME = 'history:topic' IE_NAME = 'history:topic'
IE_DESC = 'History.com Topic' IE_DESC = 'History.com Topic'
_VALID_URL = r'https?://(?:www\.)?history\.com/topics/(?:[^/]+/)?(?P<topic_id>[^/]+)(?:/[^/]+(?:/(?P<video_display_id>[^/?#]+))?)?' _VALID_URL = r'https?://(?:www\.)?history\.com/topics/[^/]+/(?P<id>[\w+-]+?)-video'
_TESTS = [{ _TESTS = [{
'url': 'http://www.history.com/topics/valentines-day/history-of-valentines-day/videos/bet-you-didnt-know-valentines-day?m=528e394da93ae&s=undefined&f=1&free=false', 'url': 'https://www.history.com/topics/valentines-day/history-of-valentines-day-video',
'info_dict': { 'info_dict': {
'id': '40700995724', 'id': '40700995724',
'ext': 'mp4', 'ext': 'mp4',
'title': "Bet You Didn't Know: Valentine's Day", 'title': "History of Valentines Day",
'description': 'md5:7b57ea4829b391995b405fa60bd7b5f7', 'description': 'md5:7b57ea4829b391995b405fa60bd7b5f7',
'timestamp': 1375819729, 'timestamp': 1375819729,
'upload_date': '20130806', 'upload_date': '20130806',
'uploader': 'AENE-NEW',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
'add_ie': ['ThePlatform'], 'add_ie': ['ThePlatform'],
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history/videos',
'info_dict':
{
'id': 'world-war-i-history',
'title': 'World War I History',
},
'playlist_mincount': 23,
}, {
'url': 'http://www.history.com/topics/world-war-i-history/videos',
'only_matching': True,
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history',
'only_matching': True,
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history/speeches',
'only_matching': True,
}] }]
def theplatform_url_result(self, theplatform_url, video_id, query): def theplatform_url_result(self, theplatform_url, video_id, query):
@ -215,27 +229,19 @@ class HistoryTopicIE(AENetworksBaseIE):
} }
def _real_extract(self, url): def _real_extract(self, url):
topic_id, video_display_id = re.match(self._VALID_URL, url).groups() display_id = self._match_id(url)
if video_display_id: webpage = self._download_webpage(url, display_id)
webpage = self._download_webpage(url, video_display_id) video_id = self._search_regex(
release_url, video_id = re.search(r"_videoPlayer.play\('([^']+)'\s*,\s*'[^']+'\s*,\s*'(\d+)'\)", webpage).groups() r'<phoenix-iframe[^>]+src="[^"]+\btpid=(\d+)', webpage, 'tpid')
release_url = unescapeHTML(release_url) result = self._download_json(
'https://feeds.video.aetnd.com/api/v2/history/videos',
return self.theplatform_url_result( video_id, query={'filter[id]': video_id})['results'][0]
release_url, video_id, { title = result['title']
'mbr': 'true', info = self._extract_aen_smil(result['publicUrl'], video_id)
'switch': 'hls', info.update({
'assetTypes': 'high_video_ak', 'title': title,
'description': result.get('description'),
'duration': int_or_none(result.get('duration')),
'timestamp': int_or_none(result.get('added'), 1000),
}) })
else: return info
webpage = self._download_webpage(url, topic_id)
entries = []
for episode_item in re.findall(r'<a.+?data-release-url="[^"]+"[^>]*>', webpage):
video_attributes = extract_attributes(episode_item)
entries.append(self.theplatform_url_result(
video_attributes['data-release-url'], video_attributes['data-id'], {
'mbr': 'true',
'switch': 'hls',
'assetTypes': 'high_video_ak',
}))
return self.playlist_result(entries, topic_id, get_element_by_attribute('class', 'show-title', webpage))
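
Note on `_extract_aen_smil` above: it tries several SMIL query variants in turn, remembers the last failure, and raises only when no variant produced any format. A minimal sketch of that control flow is given here. `ExtractionError`, `fake_fetch`, `always_fail` and the format names are hypothetical stand-ins; the real method also signs each URL and merges subtitles, which this sketch omits.

```python
class ExtractionError(Exception):
    """Stand-in for youtube-dl's ExtractorError."""


BASE_QUERY = {'mbr': 'true'}
VARIANTS = [
    {'assetTypes': 'high_video_ak', 'switch': 'hls_high_ak'},
    {'assetTypes': 'high_video_s3'},
    {'assetTypes': 'high_video_s3', 'switch': 'hls_ingest_fastly'},
]


def fake_fetch(query):
    # Hypothetical backend: only plain 'high_video_s3' yields formats.
    if query.get('assetTypes') == 'high_video_s3' and 'switch' not in query:
        return ['s3-720p', 's3-1080p']
    raise ExtractionError('no formats for %r' % query.get('switch'))


def always_fail(query):
    # Hypothetical backend on which every variant fails.
    raise ExtractionError('unavailable')


def collect_formats(fetch):
    formats = []
    last_error = None
    for variant in VARIANTS:
        query = dict(BASE_QUERY, **variant)  # copy so VARIANTS stays untouched
        try:
            formats.extend(fetch(query))
        except ExtractionError as e:
            last_error = e  # keep going; another variant may still work
            continue
    if last_error and not formats:
        raise last_error  # every variant failed, so surface the last error
    return formats


formats = collect_formats(fake_fetch)
```

A partial failure is therefore silent as long as at least one variant returns formats, which matches the `if last_e and not formats` check in the diff.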


@ -1,30 +0,0 @@
from __future__ import unicode_literals
from .nuevo import NuevoBaseIE
class AnitubeIE(NuevoBaseIE):
IE_NAME = 'anitube.se'
_VALID_URL = r'https?://(?:www\.)?anitube\.se/video/(?P<id>\d+)'
_TEST = {
'url': 'http://www.anitube.se/video/36621',
'md5': '59d0eeae28ea0bc8c05e7af429998d43',
'info_dict': {
'id': '36621',
'ext': 'mp4',
'title': 'Recorder to Randoseru 01',
'duration': 180.19,
},
'skip': 'Blocked in the US',
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
key = self._search_regex(
r'src=["\']https?://[^/]+/embed/([A-Za-z0-9_-]+)', webpage, 'key')
return self._extract_nuevo(
'http://www.anitube.se/nuevo/econfig.php?key=%s' % key, video_id)


@ -1,61 +0,0 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
parse_duration,
int_or_none,
)
class AnySexIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?anysex\.com/(?P<id>\d+)'
_TEST = {
'url': 'http://anysex.com/156592/',
'md5': '023e9fbb7f7987f5529a394c34ad3d3d',
'info_dict': {
'id': '156592',
'ext': 'mp4',
'title': 'Busty and sexy blondie in her bikini strips for you',
'description': 'md5:de9e418178e2931c10b62966474e1383',
'categories': ['Erotic'],
'duration': 270,
'age_limit': 18,
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id)
video_url = self._html_search_regex(r"video_url\s*:\s*'([^']+)'", webpage, 'video URL')
title = self._html_search_regex(r'<title>(.*?)</title>', webpage, 'title')
description = self._html_search_regex(
r'<div class="description"[^>]*>([^<]+)</div>', webpage, 'description', fatal=False)
thumbnail = self._html_search_regex(
r'preview_url\s*:\s*\'(.*?)\'', webpage, 'thumbnail', fatal=False)
categories = re.findall(
r'<a href="http://anysex\.com/categories/[^"]+" title="[^"]*">([^<]+)</a>', webpage)
duration = parse_duration(self._search_regex(
r'<b>Duration:</b> (?:<q itemprop="duration">)?(\d+:\d+)', webpage, 'duration', fatal=False))
view_count = int_or_none(self._html_search_regex(
r'<b>Views:</b> (\d+)', webpage, 'view count', fatal=False))
return {
'id': video_id,
'url': video_url,
'ext': 'mp4',
'title': title,
'description': description,
'thumbnail': thumbnail,
'categories': categories,
'duration': duration,
'view_count': view_count,
'age_limit': 18,
}


@ -4,6 +4,10 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import (
compat_parse_qs,
compat_urllib_parse_urlparse,
)
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none, int_or_none,
@ -12,12 +16,12 @@ from ..utils import (
class AolIE(InfoExtractor): class AolIE(InfoExtractor):
IE_NAME = 'on.aol.com' IE_NAME = 'aol.com'
_VALID_URL = r'(?:aol-video:|https?://(?:(?:www|on)\.)?aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)' _VALID_URL = r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>[0-9a-f]+)'
_TESTS = [{ _TESTS = [{
# video with 5min ID # video with 5min ID
'url': 'http://on.aol.com/video/u-s--official-warns-of-largest-ever-irs-phone-scam-518167793?icid=OnHomepageC2Wide_MustSee_Img', 'url': 'https://www.aol.com/video/view/u-s--official-warns-of-largest-ever-irs-phone-scam/518167793/',
'md5': '18ef68f48740e86ae94b98da815eec42', 'md5': '18ef68f48740e86ae94b98da815eec42',
'info_dict': { 'info_dict': {
'id': '518167793', 'id': '518167793',
@ -34,7 +38,7 @@ class AolIE(InfoExtractor):
} }
}, { }, {
# video with vidible ID # video with vidible ID
'url': 'http://www.aol.com/video/view/netflix-is-raising-rates/5707d6b8e4b090497b04f706/', 'url': 'https://www.aol.com/video/view/netflix-is-raising-rates/5707d6b8e4b090497b04f706/',
'info_dict': { 'info_dict': {
'id': '5707d6b8e4b090497b04f706', 'id': '5707d6b8e4b090497b04f706',
'ext': 'mp4', 'ext': 'mp4',
@ -49,17 +53,29 @@ class AolIE(InfoExtractor):
'skip_download': True, 'skip_download': True,
} }
}, { }, {
'url': 'http://on.aol.com/partners/abc-551438d309eab105804dbfe8/sneak-peek-was-haley-really-framed-570eaebee4b0448640a5c944', 'url': 'https://www.aol.com/video/view/park-bench-season-2-trailer/559a1b9be4b0c3bfad3357a7/',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'http://on.aol.com/shows/park-bench-shw518173474-559a1b9be4b0c3bfad3357a7?context=SH:SHW518173474:PL4327:1460619712763', 'url': 'https://www.aol.com/video/view/donald-trump-spokeswoman-tones-down-megyn-kelly-attacks/519442220/',
'only_matching': True,
}, {
'url': 'http://on.aol.com/video/519442220',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'aol-video:5707d6b8e4b090497b04f706', 'url': 'aol-video:5707d6b8e4b090497b04f706',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.aol.com/video/playlist/PL8245/5ca79d19d21f1a04035db606/',
'only_matching': True,
}, {
'url': 'https://www.aol.ca/video/view/u-s-woman-s-family-arrested-for-murder-first-pinned-on-panhandler-police/5c7ccf45bc03931fa04b2fe1/',
'only_matching': True,
}, {
'url': 'https://www.aol.co.uk/video/view/-one-dead-and-22-hurt-in-bus-crash-/5cb3a6f3d21f1a072b457347/',
'only_matching': True,
}, {
'url': 'https://www.aol.de/video/view/eva-braun-privataufnahmen-von-hitlers-geliebter-werden-digitalisiert/5cb2d49de98ab54c113d3d5d/',
'only_matching': True,
}, {
'url': 'https://www.aol.jp/video/playlist/5a28e936a1334d000137da0c/5a28f3151e642219fde19831/',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@ -73,7 +89,7 @@ class AolIE(InfoExtractor):
video_data = response['data'] video_data = response['data']
formats = [] formats = []
m3u8_url = video_data.get('videoMasterPlaylist') m3u8_url = url_or_none(video_data.get('videoMasterPlaylist'))
if m3u8_url: if m3u8_url:
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False)) m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
@ -96,6 +112,12 @@ class AolIE(InfoExtractor):
'width': int(mobj.group(1)), 'width': int(mobj.group(1)),
'height': int(mobj.group(2)), 'height': int(mobj.group(2)),
}) })
else:
qs = compat_parse_qs(compat_urllib_parse_urlparse(video_url).query)
f.update({
'width': int_or_none(qs.get('w', [None])[0]),
'height': int_or_none(qs.get('h', [None])[0]),
})
formats.append(f) formats.append(f)
self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id')) self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id'))
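
Note on the new `else` branch in `AolIE` above: when the rendition URL does not match the size regex, width and height are read from the `w` and `h` query parameters instead. A minimal sketch of that fallback is shown here, assuming Python 3 and using `urllib.parse` in place of the `compat_parse_qs` and `compat_urllib_parse_urlparse` wrappers. `dimensions_from_url`, the reduced `int_or_none` and the `example.com` URLs are illustrative assumptions.

```python
from urllib.parse import parse_qs, urlparse


def int_or_none(value):
    # Reduced stand-in for youtube_dl.utils.int_or_none.
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def dimensions_from_url(video_url):
    # parse_qs maps each key to a list of values, hence the [None] default
    # and the [0] index.
    qs = parse_qs(urlparse(video_url).query)
    return {
        'width': int_or_none(qs.get('w', [None])[0]),
        'height': int_or_none(qs.get('h', [None])[0]),
    }


dims = dimensions_from_url('https://example.com/video.mp4?w=1280&h=720')
missing = dimensions_from_url('https://example.com/video.mp4')
```

Missing or non-numeric parameters yield `None` rather than an exception, so the format entry is still usable without dimensions.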


@ -103,7 +103,7 @@ class ArkenaIE(InfoExtractor):
f_url, video_id, mpd_id=kind, fatal=False)) f_url, video_id, mpd_id=kind, fatal=False))
elif kind == 'silverlight': elif kind == 'silverlight':
# TODO: process when ism is supported (see # TODO: process when ism is supported (see
# https://github.com/rg3/youtube-dl/issues/8118) # https://github.com/ytdl-org/youtube-dl/issues/8118)
continue continue
else: else:
tbr = float_or_none(f.get('Bitrate'), 1000) tbr = float_or_none(f.get('Bitrate'), 1000)


@ -4,17 +4,10 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import ( from ..compat import compat_str
compat_parse_qs,
compat_str,
compat_urllib_parse_urlparse,
)
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
find_xpath_attr,
get_element_by_attribute,
int_or_none, int_or_none,
NO_DEFAULT,
qualities, qualities,
try_get, try_get,
unified_strdate, unified_strdate,
@ -25,59 +18,7 @@ from ..utils import (
# add tests. # add tests.
class ArteTvIE(InfoExtractor):
_VALID_URL = r'https?://videos\.arte\.tv/(?P<lang>fr|de|en|es)/.*-(?P<id>.*?)\.html'
IE_NAME = 'arte.tv'
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
lang = mobj.group('lang')
video_id = mobj.group('id')
ref_xml_url = url.replace('/videos/', '/do_delegate/videos/')
ref_xml_url = ref_xml_url.replace('.html', ',view,asPlayerXml.xml')
ref_xml_doc = self._download_xml(
ref_xml_url, video_id, note='Downloading metadata')
config_node = find_xpath_attr(ref_xml_doc, './/video', 'lang', lang)
config_xml_url = config_node.attrib['ref']
config = self._download_xml(
config_xml_url, video_id, note='Downloading configuration')
formats = [{
'format_id': q.attrib['quality'],
# The playpath starts at 'mp4:', if we don't manually
# split the url, rtmpdump will incorrectly parse them
'url': q.text.split('mp4:', 1)[0],
'play_path': 'mp4:' + q.text.split('mp4:', 1)[1],
'ext': 'flv',
'quality': 2 if q.attrib['quality'] == 'hd' else 1,
} for q in config.findall('./urls/url')]
self._sort_formats(formats)
title = config.find('.//name').text
thumbnail = config.find('.//firstThumbnailUrl').text
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'formats': formats,
}
class ArteTVBaseIE(InfoExtractor): class ArteTVBaseIE(InfoExtractor):
@classmethod
def _extract_url_info(cls, url):
mobj = re.match(cls._VALID_URL, url)
lang = mobj.group('lang')
query = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
if 'vid' in query:
video_id = query['vid'][0]
else:
# This is not a real id, it can be for example AJT for the news
# http://www.arte.tv/guide/fr/emissions/AJT/arte-journal
video_id = mobj.group('id')
return video_id, lang
def _extract_from_json_url(self, json_url, video_id, lang, title=None): def _extract_from_json_url(self, json_url, video_id, lang, title=None):
info = self._download_json(json_url, video_id) info = self._download_json(json_url, video_id)
player_info = info['videoJsonPlayer'] player_info = info['videoJsonPlayer']
@ -108,13 +49,15 @@ class ArteTVBaseIE(InfoExtractor):
'upload_date': unified_strdate(upload_date_str), 'upload_date': unified_strdate(upload_date_str),
'thumbnail': player_info.get('programImage') or player_info.get('VTU', {}).get('IUR'), 'thumbnail': player_info.get('programImage') or player_info.get('VTU', {}).get('IUR'),
} }
qfunc = qualities(['HQ', 'MQ', 'EQ', 'SQ']) qfunc = qualities(['MQ', 'HQ', 'EQ', 'SQ'])
LANGS = { LANGS = {
'fr': 'F', 'fr': 'F',
'de': 'A', 'de': 'A',
'en': 'E[ANG]', 'en': 'E[ANG]',
'es': 'E[ESP]', 'es': 'E[ESP]',
'it': 'E[ITA]',
'pl': 'E[POL]',
} }
langcode = LANGS.get(lang, lang) langcode = LANGS.get(lang, lang)
@ -126,8 +69,8 @@ class ArteTVBaseIE(InfoExtractor):
l = re.escape(langcode) l = re.escape(langcode)
# Language preference from most to least priority # Language preference from most to least priority
# Reference: section 5.6.3 of # Reference: section 6.8 of
# http://www.arte.tv/sites/en/corporate/files/complete-technical-guidelines-arte-geie-v1-05.pdf # https://www.arte.tv/sites/en/corporate/files/complete-technical-guidelines-arte-geie-v1-07-1.pdf
PREFERENCES = ( PREFERENCES = (
# original version in requested language, without subtitles # original version in requested language, without subtitles
r'VO{0}$'.format(l), r'VO{0}$'.format(l),
@ -193,274 +136,59 @@ class ArteTVBaseIE(InfoExtractor):
class ArteTVPlus7IE(ArteTVBaseIE): class ArteTVPlus7IE(ArteTVBaseIE):
IE_NAME = 'arte.tv:+7' IE_NAME = 'arte.tv:+7'
_VALID_URL = r'https?://(?:(?:www|sites)\.)?arte\.tv/(?:[^/]+/)?(?P<lang>fr|de|en|es)/(?:videos/)?(?:[^/]+/)*(?P<id>[^/?#&]+)' _VALID_URL = r'https?://(?:www\.)?arte\.tv/(?P<lang>fr|de|en|es|it|pl)/videos/(?P<id>\d{6}-\d{3}-[AF])'
_TESTS = [{ _TESTS = [{
'url': 'http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D', 'url': 'https://www.arte.tv/en/videos/088501-000-A/mexico-stealing-petrol-to-survive/',
'only_matching': True, 'info_dict': {
}, { 'id': '088501-000-A',
'url': 'http://sites.arte.tv/karambolage/de/video/karambolage-22', 'ext': 'mp4',
'only_matching': True, 'title': 'Mexico: Stealing Petrol to Survive',
}, { 'upload_date': '20190628',
'url': 'http://www.arte.tv/de/videos/048696-000-A/der-kluge-bauch-unser-zweites-gehirn', },
'only_matching': True,
}] }]
@classmethod
def suitable(cls, url):
return False if ArteTVPlaylistIE.suitable(url) else super(ArteTVPlus7IE, cls).suitable(url)
def _real_extract(self, url): def _real_extract(self, url):
video_id, lang = self._extract_url_info(url) lang, video_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, video_id) return self._extract_from_json_url(
return self._extract_from_webpage(webpage, video_id, lang) 'https://api.arte.tv/api/player/v1/config/%s/%s' % (lang, video_id),
video_id, lang)
def _extract_from_webpage(self, webpage, video_id, lang):
patterns_templates = (r'arte_vp_url=["\'](.*?%s.*?)["\']', r'data-url=["\']([^"]+%s[^"]+)["\']')
ids = (video_id, '')
# some pages contain multiple videos (like
# http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D),
# so we first try to look for json URLs that contain the video id from
# the 'vid' parameter.
patterns = [t % re.escape(_id) for _id in ids for t in patterns_templates]
json_url = self._html_search_regex(
patterns, webpage, 'json vp url', default=None)
if not json_url:
def find_iframe_url(webpage, default=NO_DEFAULT):
return self._html_search_regex(
r'<iframe[^>]+src=(["\'])(?P<url>.+\bjson_url=.+?)\1',
webpage, 'iframe url', group='url', default=default)
iframe_url = find_iframe_url(webpage, None)
if not iframe_url:
embed_url = self._html_search_regex(
r'arte_vp_url_oembed=\'([^\']+?)\'', webpage, 'embed url', default=None)
if embed_url:
player = self._download_json(
embed_url, video_id, 'Downloading player page')
iframe_url = find_iframe_url(player['html'])
# en and es URLs produce react-based pages with different layout (e.g.
# http://www.arte.tv/guide/en/053330-002-A/carnival-italy?zone=world)
if not iframe_url:
program = self._search_regex(
r'program\s*:\s*({.+?["\']embed_html["\'].+?}),?\s*\n',
webpage, 'program', default=None)
if program:
embed_html = self._parse_json(program, video_id)
if embed_html:
iframe_url = find_iframe_url(embed_html['embed_html'])
if iframe_url:
json_url = compat_parse_qs(
compat_urllib_parse_urlparse(iframe_url).query)['json_url'][0]
if json_url:
title = self._search_regex(
r'<h3[^>]+title=(["\'])(?P<title>.+?)\1',
webpage, 'title', default=None, group='title')
return self._extract_from_json_url(json_url, video_id, lang, title=title)
# Different kind of embed URL (e.g.
# http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium)
entries = [
self.url_result(url)
for _, url in re.findall(r'<iframe[^>]+src=(["\'])(?P<url>.+?)\1', webpage)]
return self.playlist_result(entries)
# It also uses the arte_vp_url url from the webpage to extract the information
class ArteTVCreativeIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:creative'
_VALID_URL = r'https?://creative\.arte\.tv/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://creative.arte.tv/fr/episode/osmosis-episode-1',
'info_dict': {
'id': '057405-001-A',
'ext': 'mp4',
'title': 'OSMOSIS - N\'AYEZ PLUS PEUR D\'AIMER (1)',
'upload_date': '20150716',
},
}, {
'url': 'http://creative.arte.tv/fr/Monty-Python-Reunion',
'playlist_count': 11,
'add_ie': ['Youtube'],
}, {
'url': 'http://creative.arte.tv/de/episode/agentur-amateur-4-der-erste-kunde',
'only_matching': True,
}]
class ArteTVInfoIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:info'
_VALID_URL = r'https?://info\.arte\.tv/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://info.arte.tv/fr/service-civique-un-cache-misere',
'info_dict': {
'id': '067528-000-A',
'ext': 'mp4',
'title': 'Service civique, un cache misère ?',
'upload_date': '20160403',
},
}]
class ArteTVFutureIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:future'
_VALID_URL = r'https?://future\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://future.arte.tv/fr/info-sciences/les-ecrevisses-aussi-sont-anxieuses',
'info_dict': {
'id': '050940-028-A',
'ext': 'mp4',
'title': 'Les écrevisses aussi peuvent être anxieuses',
'upload_date': '20140902',
},
}, {
'url': 'http://future.arte.tv/fr/la-science-est-elle-responsable',
'only_matching': True,
}]
class ArteTVDDCIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:ddc'
_VALID_URL = r'https?://ddc\.arte\.tv/(?P<lang>emission|folge)/(?P<id>[^/?#&]+)'
_TESTS = []
def _real_extract(self, url):
video_id, lang = self._extract_url_info(url)
if lang == 'folge':
lang = 'de'
elif lang == 'emission':
lang = 'fr'
webpage = self._download_webpage(url, video_id)
scriptElement = get_element_by_attribute('class', 'visu_video_block', webpage)
script_url = self._html_search_regex(r'src="(.*?)"', scriptElement, 'script url')
javascriptPlayerGenerator = self._download_webpage(script_url, video_id, 'Download javascript player generator')
json_url = self._search_regex(r"json_url=(.*)&rendering_place.*", javascriptPlayerGenerator, 'json url')
return self._extract_from_json_url(json_url, video_id, lang)
class ArteTVConcertIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:concert'
_VALID_URL = r'https?://concert\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://concert.arte.tv/de/notwist-im-pariser-konzertclub-divan-du-monde',
'md5': '9ea035b7bd69696b67aa2ccaaa218161',
'info_dict': {
'id': '186',
'ext': 'mp4',
'title': 'The Notwist im Pariser Konzertclub "Divan du Monde"',
'upload_date': '20140128',
'description': 'md5:486eb08f991552ade77439fe6d82c305',
},
}]
class ArteTVCinemaIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:cinema'
_VALID_URL = r'https?://cinema\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>.+)'
_TESTS = [{
'url': 'http://cinema.arte.tv/fr/article/les-ailes-du-desir-de-julia-reck',
'md5': 'a5b9dd5575a11d93daf0e3f404f45438',
'info_dict': {
'id': '062494-000-A',
'ext': 'mp4',
'title': 'Film lauréat du concours web - "Les ailes du désir" de Julia Reck',
'upload_date': '20150807',
},
}]
class ArteTVMagazineIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:magazine'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/magazine/[^/]+/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
# Embedded via <iframe src="http://www.arte.tv/arte_vp/index.php?json_url=..."
'url': 'http://www.arte.tv/magazine/trepalium/fr/entretien-avec-le-realisateur-vincent-lannoo-trepalium',
'md5': '2a9369bcccf847d1c741e51416299f25',
'info_dict': {
'id': '065965-000-A',
'ext': 'mp4',
'title': 'Trepalium - Extrait Ep.01',
'upload_date': '20160121',
},
}, {
# Embedded via <iframe src="http://www.arte.tv/guide/fr/embed/054813-004-A/medium"
'url': 'http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium',
'md5': 'fedc64fc7a946110fe311634e79782ca',
'info_dict': {
'id': '054813-004_PLUS7-F',
'ext': 'mp4',
'title': 'Trepalium (4/6)',
'description': 'md5:10057003c34d54e95350be4f9b05cb40',
'upload_date': '20160218',
},
}, {
'url': 'http://www.arte.tv/magazine/metropolis/de/frank-woeste-german-paris-metropolis',
'only_matching': True,
}]
class ArteTVEmbedIE(ArteTVPlus7IE): class ArteTVEmbedIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:embed' IE_NAME = 'arte.tv:embed'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
http://www\.arte\.tv https://www\.arte\.tv
/(?:playerv2/embed|arte_vp/index)\.php\?json_url= /player/v3/index\.php\?json_url=
(?P<json_url> (?P<json_url>
http://arte\.tv/papi/tvguide/videos/stream/player/ https?://api\.arte\.tv/api/player/v1/config/
(?P<lang>[^/]+)/(?P<id>[^/]+)[^&]* (?P<lang>[^/]+)/(?P<id>\d{6}-\d{3}-[AF])
) )
''' '''
_TESTS = [] _TESTS = []
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) json_url, lang, video_id = re.match(self._VALID_URL, url).groups()
video_id = mobj.group('id')
lang = mobj.group('lang')
json_url = mobj.group('json_url')
return self._extract_from_json_url(json_url, video_id, lang) return self._extract_from_json_url(json_url, video_id, lang)
class TheOperaPlatformIE(ArteTVPlus7IE):
IE_NAME = 'theoperaplatform'
_VALID_URL = r'https?://(?:www\.)?theoperaplatform\.eu/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.theoperaplatform.eu/de/opera/verdi-otello',
'md5': '970655901fa2e82e04c00b955e9afe7b',
'info_dict': {
'id': '060338-009-A',
'ext': 'mp4',
'title': 'Verdi - OTELLO',
'upload_date': '20160927',
},
}]
class ArteTVPlaylistIE(ArteTVBaseIE): class ArteTVPlaylistIE(ArteTVBaseIE):
IE_NAME = 'arte.tv:playlist' IE_NAME = 'arte.tv:playlist'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/guide/(?P<lang>fr|de|en|es)/[^#]*#collection/(?P<id>PL-\d+)' _VALID_URL = r'https?://(?:www\.)?arte\.tv/(?P<lang>fr|de|en|es|it|pl)/videos/(?P<id>RC-\d{6})'
_TESTS = [{ _TESTS = [{
'url': 'http://www.arte.tv/guide/de/plus7/?country=DE#collection/PL-013263/ARTETV', 'url': 'https://www.arte.tv/en/videos/RC-016954/earn-a-living/',
'info_dict': { 'info_dict': {
'id': 'PL-013263', 'id': 'RC-016954',
'title': 'Areva & Uramin', 'title': 'Earn a Living',
'description': 'md5:a1dc0312ce357c262259139cfd48c9bf', 'description': 'md5:d322c55011514b3a7241f7fb80d494c2',
}, },
'playlist_mincount': 6, 'playlist_mincount': 6,
}, {
'url': 'http://www.arte.tv/guide/de/playlists?country=DE#collection/PL-013190/ARTETV',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
playlist_id, lang = self._extract_url_info(url) lang, playlist_id = re.match(self._VALID_URL, url).groups()
collection = self._download_json( collection = self._download_json(
'https://api.arte.tv/api/player/v1/collectionData/%s/%s?source=videos' 'https://api.arte.tv/api/player/v1/collectionData/%s/%s?source=videos'
% (lang, playlist_id), playlist_id) % (lang, playlist_id), playlist_id)
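
Note on the ARTE hunks above: the rewritten extractors no longer scrape the web page; they take `lang` and the ID straight from the URL and build an API endpoint from them. A rough sketch of that mapping, with both patterns and both endpoint templates copied from the new side of the diff; `api_url_for` is a hypothetical helper introduced only for illustration:

```python
import re

# Patterns copied from the new side of the diff.
VIDEO_RE = r'https?://(?:www\.)?arte\.tv/(?P<lang>fr|de|en|es|it|pl)/videos/(?P<id>\d{6}-\d{3}-[AF])'
PLAYLIST_RE = r'https?://(?:www\.)?arte\.tv/(?P<lang>fr|de|en|es|it|pl)/videos/(?P<id>RC-\d{6})'

API_BASE = 'https://api.arte.tv/api/player/v1'


def api_url_for(url):
    """Return the API endpoint for a video or collection page URL, else None."""
    mobj = re.match(PLAYLIST_RE, url)
    if mobj:
        lang, playlist_id = mobj.groups()
        return '%s/collectionData/%s/%s?source=videos' % (API_BASE, lang, playlist_id)
    mobj = re.match(VIDEO_RE, url)
    if mobj:
        lang, video_id = mobj.groups()
        return '%s/config/%s/%s' % (API_BASE, lang, video_id)
    return None
```

Because the two ID shapes (`RC-` plus six digits versus `dddddd-ddd-A/F`) cannot match the same URL, the old `suitable()` override that excluded playlist URLs is no longer needed.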


@ -5,14 +5,12 @@ import re
from .common import InfoExtractor from .common import InfoExtractor
from .kaltura import KalturaIE from .kaltura import KalturaIE
from ..utils import ( from ..utils import extract_attributes
extract_attributes,
remove_end,
)
class AsianCrushIE(InfoExtractor): class AsianCrushIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?asiancrush\.com/video/(?:[^/]+/)?0+(?P<id>\d+)v\b' _VALID_URL_BASE = r'https?://(?:www\.)?(?P<host>(?:(?:asiancrush|yuyutv|midnightpulp)\.com|cocoro\.tv))'
_VALID_URL = r'%s/video/(?:[^/]+/)?0+(?P<id>\d+)v\b' % _VALID_URL_BASE
_TESTS = [{ _TESTS = [{
'url': 'https://www.asiancrush.com/video/012869v/women-who-flirt/', 'url': 'https://www.asiancrush.com/video/012869v/women-who-flirt/',
'md5': 'c3b740e48d0ba002a42c0b72857beae6', 'md5': 'c3b740e48d0ba002a42c0b72857beae6',
@ -20,7 +18,7 @@ class AsianCrushIE(InfoExtractor):
'id': '1_y4tmjm5r', 'id': '1_y4tmjm5r',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Women Who Flirt', 'title': 'Women Who Flirt',
'description': 'md5:3db14e9186197857e7063522cb89a805', 'description': 'md5:7e986615808bcfb11756eb503a751487',
'timestamp': 1496936429, 'timestamp': 1496936429,
'upload_date': '20170608', 'upload_date': '20170608',
'uploader_id': 'craig@crifkin.com', 'uploader_id': 'craig@crifkin.com',
@ -28,10 +26,27 @@ class AsianCrushIE(InfoExtractor):
}, { }, {
'url': 'https://www.asiancrush.com/video/she-was-pretty/011886v-pretty-episode-3/', 'url': 'https://www.asiancrush.com/video/she-was-pretty/011886v-pretty-episode-3/',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.yuyutv.com/video/013886v/the-act-of-killing/',
'only_matching': True,
}, {
'url': 'https://www.yuyutv.com/video/peep-show/013922v-warring-factions/',
'only_matching': True,
}, {
'url': 'https://www.midnightpulp.com/video/010400v/drifters/',
'only_matching': True,
}, {
'url': 'https://www.midnightpulp.com/video/mononoke/016378v-zashikiwarashi-part-1/',
'only_matching': True,
}, {
'url': 'https://www.cocoro.tv/video/the-wonderful-wizard-of-oz/008878v-the-wonderful-wizard-of-oz-ep01/',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) mobj = re.match(self._VALID_URL, url)
host = mobj.group('host')
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
@ -51,7 +66,7 @@ class AsianCrushIE(InfoExtractor):
r'\bentry_id["\']\s*:\s*["\'](\d+)', webpage, 'entry id') r'\bentry_id["\']\s*:\s*["\'](\d+)', webpage, 'entry id')
player = self._download_webpage( player = self._download_webpage(
'https://api.asiancrush.com/embeddedVideoPlayer', video_id, 'https://api.%s/embeddedVideoPlayer' % host, video_id,
query={'id': entry_id}) query={'id': entry_id})
kaltura_id = self._search_regex( kaltura_id = self._search_regex(
@ -63,15 +78,23 @@ class AsianCrushIE(InfoExtractor):
r'/p(?:artner_id)?/(\d+)', player, 'partner id', r'/p(?:artner_id)?/(\d+)', player, 'partner id',
default='513551') default='513551')
return self.url_result( description = self._html_search_regex(
'kaltura:%s:%s' % (partner_id, kaltura_id), r'(?s)<div[^>]+\bclass=["\']description["\'][^>]*>(.+?)</div>',
ie=KalturaIE.ie_key(), video_id=kaltura_id, webpage, 'description', fatal=False)
video_title=title)
return {
'_type': 'url_transparent',
'url': 'kaltura:%s:%s' % (partner_id, kaltura_id),
'ie_key': KalturaIE.ie_key(),
'id': video_id,
'title': title,
'description': description,
}
class AsianCrushPlaylistIE(InfoExtractor): class AsianCrushPlaylistIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?asiancrush\.com/series/0+(?P<id>\d+)s\b' _VALID_URL = r'%s/series/0+(?P<id>\d+)s\b' % AsianCrushIE._VALID_URL_BASE
_TEST = { _TESTS = [{
'url': 'https://www.asiancrush.com/series/012481s/scholar-walks-night/', 'url': 'https://www.asiancrush.com/series/012481s/scholar-walks-night/',
'info_dict': { 'info_dict': {
'id': '12481', 'id': '12481',
@ -79,7 +102,16 @@ class AsianCrushPlaylistIE(InfoExtractor):
'description': 'md5:7addd7c5132a09fd4741152d96cce886', 'description': 'md5:7addd7c5132a09fd4741152d96cce886',
}, },
'playlist_count': 20, 'playlist_count': 20,
} }, {
'url': 'https://www.yuyutv.com/series/013920s/peep-show/',
'only_matching': True,
}, {
'url': 'https://www.midnightpulp.com/series/016375s/mononoke/',
'only_matching': True,
}, {
'url': 'https://www.cocoro.tv/series/008549s/the-wonderful-wizard-of-oz/',
'only_matching': True,
}]
def _real_extract(self, url): def _real_extract(self, url):
playlist_id = self._match_id(url) playlist_id = self._match_id(url)
@ -96,15 +128,15 @@ class AsianCrushPlaylistIE(InfoExtractor):
entries.append(self.url_result( entries.append(self.url_result(
mobj.group('url'), ie=AsianCrushIE.ie_key())) mobj.group('url'), ie=AsianCrushIE.ie_key()))
title = remove_end( title = self._html_search_regex(
self._html_search_regex(
r'(?s)<h1\b[^>]\bid=["\']movieTitle[^>]+>(.+?)</h1>', webpage, r'(?s)<h1\b[^>]\bid=["\']movieTitle[^>]+>(.+?)</h1>', webpage,
'title', default=None) or self._og_search_title( 'title', default=None) or self._og_search_title(
webpage, default=None) or self._html_search_meta( webpage, default=None) or self._html_search_meta(
'twitter:title', webpage, 'title', 'twitter:title', webpage, 'title',
default=None) or self._search_regex( default=None) or self._search_regex(
r'<title>([^<]+)</title>', webpage, 'title', fatal=False), r'<title>([^<]+)</title>', webpage, 'title', fatal=False)
' | AsianCrush') if title:
title = re.sub(r'\s*\|\s*.+?$', '', title)
description = self._og_search_description( description = self._og_search_description(
webpage, default=None) or self._html_search_meta( webpage, default=None) or self._html_search_meta(
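
Note on the AsianCrush hunks above: the shared `_VALID_URL_BASE` captures the host so that the player API URL can be derived per site, and the fixed `' | AsianCrush'` suffix removal is replaced by a generic regex. A small sketch of both pieces, using the patterns from the diff; the helper names and the sample title are illustrative assumptions:

```python
import re

# Both patterns are copied from the new side of the diff.
VALID_URL_BASE = r'https?://(?:www\.)?(?P<host>(?:(?:asiancrush|yuyutv|midnightpulp)\.com|cocoro\.tv))'
VALID_URL = r'%s/video/(?:[^/]+/)?0+(?P<id>\d+)v\b' % VALID_URL_BASE


def player_api_url(url):
    """Build the embedded-player endpoint on the same host as the page URL."""
    mobj = re.match(VALID_URL, url)
    if not mobj:
        return None
    return 'https://api.%s/embeddedVideoPlayer' % mobj.group('host')


def strip_site_suffix(title):
    """Drop everything from the first ' | ' onward, whatever the site name is."""
    return re.sub(r'\s*\|\s*.+?$', '', title)
```

Note that the substitution cuts at the first pipe, so a title that itself contains `|` would be truncated there.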


@ -1,202 +1,118 @@
# coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import time
import hmac
import hashlib
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import compat_HTTPError
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
float_or_none,
int_or_none, int_or_none,
sanitized_Request,
urlencode_postdata, urlencode_postdata,
xpath_text,
) )
class AtresPlayerIE(InfoExtractor): class AtresPlayerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?atresplayer\.com/television/[^/]+/[^/]+/[^/]+/(?P<id>.+?)_\d+\.html' _VALID_URL = r'https?://(?:www\.)?atresplayer\.com/[^/]+/[^/]+/[^/]+/[^/]+/(?P<display_id>.+?)_(?P<id>[0-9a-f]{24})'
_NETRC_MACHINE = 'atresplayer' _NETRC_MACHINE = 'atresplayer'
_TESTS = [ _TESTS = [
{ {
'url': 'http://www.atresplayer.com/television/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_2014122100174.html', 'url': 'https://www.atresplayer.com/antena3/series/pequenas-coincidencias/temporada-1/capitulo-7-asuntos-pendientes_5d4aa2c57ed1a88fc715a615/',
'md5': 'efd56753cda1bb64df52a3074f62e38a',
'info_dict': { 'info_dict': {
'id': 'capitulo-10-especial-solidario-nochebuena', 'id': '5d4aa2c57ed1a88fc715a615',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Especial Solidario de Nochebuena', 'title': 'Capítulo 7: Asuntos pendientes',
'description': 'md5:e2d52ff12214fa937107d21064075bf1', 'description': 'md5:7634cdcb4d50d5381bedf93efb537fbc',
'duration': 5527.6, 'duration': 3413,
'thumbnail': r're:^https?://.*\.jpg$', },
'params': {
'format': 'bestvideo',
}, },
'skip': 'This video is only available for registered users' 'skip': 'This video is only available for registered users'
}, },
{ {
'url': 'http://www.atresplayer.com/television/especial/videoencuentros/temporada-1/capitulo-112-david-bustamante_2014121600375.html', 'url': 'https://www.atresplayer.com/lasexta/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_5ad08edf986b2855ed47adc4/',
'md5': '6e52cbb513c405e403dbacb7aacf8747', 'only_matching': True,
'info_dict': {
'id': 'capitulo-112-david-bustamante',
'ext': 'flv',
'title': 'David Bustamante',
'description': 'md5:f33f1c0a05be57f6708d4dd83a3b81c6',
'duration': 1439.0,
'thumbnail': r're:^https?://.*\.jpg$',
},
}, },
{ {
'url': 'http://www.atresplayer.com/television/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_2014122400174.html', 'url': 'https://www.atresplayer.com/antena3/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_5ad51046986b2886722ccdea/',
'only_matching': True, 'only_matching': True,
}, },
] ]
_API_BASE = 'https://api.atresplayer.com/'
_USER_AGENT = 'Dalvik/1.6.0 (Linux; U; Android 4.3; GT-I9300 Build/JSS15J'
_MAGIC = 'QWtMLXs414Yo+c#_+Q#K@NN)'
_TIMESTAMP_SHIFT = 30000
_TIME_API_URL = 'http://servicios.atresplayer.com/api/admin/time.json'
_URL_VIDEO_TEMPLATE = 'https://servicios.atresplayer.com/api/urlVideo/{1}/{0}/{1}|{2}|{3}.json'
_PLAYER_URL_TEMPLATE = 'https://servicios.atresplayer.com/episode/getplayer.json?episodePk=%s'
_EPISODE_URL_TEMPLATE = 'http://www.atresplayer.com/episodexml/%s'
_LOGIN_URL = 'https://servicios.atresplayer.com/j_spring_security_check'
_ERRORS = {
'UNPUBLISHED': 'We\'re sorry, but this video is not yet available.',
'DELETED': 'This video has expired and is no longer available for online streaming.',
'GEOUNPUBLISHED': 'We\'re sorry, but this video is not available in your region due to right restrictions.',
# 'PREMIUM': 'PREMIUM',
}
def _real_initialize(self): def _real_initialize(self):
self._login() self._login()
def _handle_error(self, e, code):
if isinstance(e.cause, compat_HTTPError) and e.cause.code == code:
error = self._parse_json(e.cause.read(), None)
if error.get('error') == 'required_registered':
self.raise_login_required()
raise ExtractorError(error['error_description'], expected=True)
raise
def _login(self): def _login(self):
username, password = self._get_login_info() username, password = self._get_login_info()
if username is None: if username is None:
return return
login_form = { self._request_webpage(
'j_username': username, self._API_BASE + 'login', None, 'Downloading login page')
'j_password': password,
}
request = sanitized_Request( try:
self._LOGIN_URL, urlencode_postdata(login_form)) target_url = self._download_json(
request.add_header('Content-Type', 'application/x-www-form-urlencoded') 'https://account.atresmedia.com/api/login', None,
response = self._download_webpage( 'Logging in', headers={
request, None, 'Logging in') 'Content-Type': 'application/x-www-form-urlencoded'
}, data=urlencode_postdata({
'username': username,
'password': password,
}))['targetUrl']
except ExtractorError as e:
self._handle_error(e, 400)
error = self._html_search_regex( self._request_webpage(target_url, None, 'Following Target URL')
r'(?s)<ul[^>]+class="[^"]*\blist_error\b[^"]*">(.+?)</ul>',
response, 'error', default=None)
if error:
raise ExtractorError(
'Unable to login: %s' % error, expected=True)
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) display_id, video_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, video_id) try:
episode = self._download_json(
self._API_BASE + 'client/v1/player/episode/' + video_id, video_id)
except ExtractorError as e:
self._handle_error(e, 403)
episode_id = self._search_regex( title = episode['titulo']
r'episode="([^"]+)"', webpage, 'episode id')
request = sanitized_Request(
self._PLAYER_URL_TEMPLATE % episode_id,
headers={'User-Agent': self._USER_AGENT})
player = self._download_json(request, episode_id, 'Downloading player JSON')
episode_type = player.get('typeOfEpisode')
error_message = self._ERRORS.get(episode_type)
if error_message:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error_message), expected=True)
formats = [] formats = []
video_url = player.get('urlVideo') for source in episode.get('sources', []):
if video_url: src = source.get('src')
format_info = { if not src:
'url': video_url,
'format_id': 'http',
}
mobj = re.search(r'(?P<bitrate>\d+)K_(?P<width>\d+)x(?P<height>\d+)', video_url)
if mobj:
format_info.update({
'width': int_or_none(mobj.group('width')),
'height': int_or_none(mobj.group('height')),
'tbr': int_or_none(mobj.group('bitrate')),
})
formats.append(format_info)
timestamp = int_or_none(self._download_webpage(
self._TIME_API_URL,
video_id, 'Downloading timestamp', fatal=False), 1000, time.time())
timestamp_shifted = compat_str(timestamp + self._TIMESTAMP_SHIFT)
token = hmac.new(
self._MAGIC.encode('ascii'),
(episode_id + timestamp_shifted).encode('utf-8'), hashlib.md5
).hexdigest()
request = sanitized_Request(
self._URL_VIDEO_TEMPLATE.format('windows', episode_id, timestamp_shifted, token),
headers={'User-Agent': self._USER_AGENT})
fmt_json = self._download_json(
request, video_id, 'Downloading windows video JSON')
result = fmt_json.get('resultDes')
if result.lower() != 'ok':
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, result), expected=True)
for format_id, video_url in fmt_json['resultObject'].items():
if format_id == 'token' or not video_url.startswith('http'):
continue continue
if 'geodeswowsmpra3player' in video_url: src_type = source.get('type')
# f4m_path = video_url.split('smil:', 1)[-1].split('free_', 1)[0] if src_type == 'application/vnd.apple.mpegurl':
# f4m_url = 'http://drg.antena3.com/{0}hds/es/sd.f4m'.format(f4m_path) formats.extend(self._extract_m3u8_formats(
# this videos are protected by DRM, the f4m downloader doesn't support them src, video_id, 'mp4', 'm3u8_native',
continue m3u8_id='hls', fatal=False))
video_url_hd = video_url.replace('free_es', 'es') elif src_type == 'application/dash+xml':
formats.extend(self._extract_f4m_formats(
video_url_hd[:-9] + '/manifest.f4m', video_id, f4m_id='hds',
fatal=False))
formats.extend(self._extract_mpd_formats( formats.extend(self._extract_mpd_formats(
video_url_hd[:-9] + '/manifest.mpd', video_id, mpd_id='dash', src, video_id, mpd_id='dash', fatal=False))
fatal=False))
self._sort_formats(formats) self._sort_formats(formats)
path_data = player.get('pathData') heartbeat = episode.get('heartbeat') or {}
omniture = episode.get('omniture') or {}
episode = self._download_xml( get_meta = lambda x: heartbeat.get(x) or omniture.get(x)
self._EPISODE_URL_TEMPLATE % path_data, video_id,
'Downloading episode XML')
duration = float_or_none(xpath_text(
episode, './media/asset/info/technical/contentDuration', 'duration'))
art = episode.find('./media/asset/info/art')
title = xpath_text(art, './name', 'title')
description = xpath_text(art, './description', 'description')
thumbnail = xpath_text(episode, './media/asset/files/background', 'thumbnail')
subtitles = {}
subtitle_url = xpath_text(episode, './media/asset/files/subtitle', 'subtitle')
if subtitle_url:
subtitles['es'] = [{
'ext': 'srt',
'url': subtitle_url,
}]
return { return {
'display_id': display_id,
'id': video_id, 'id': video_id,
'title': title, 'title': title,
'description': description, 'description': episode.get('descripcion'),
'thumbnail': thumbnail, 'thumbnail': episode.get('imgPoster'),
'duration': duration, 'duration': int_or_none(episode.get('duration')),
'formats': formats, 'formats': formats,
'subtitles': subtitles, 'channel': get_meta('channel'),
'season': get_meta('season'),
'episode_number': int_or_none(get_meta('episodeNumber')),
} }
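
Note on the AtresPlayer rewrite above: the new `_real_extract` picks a manifest parser from each source's MIME type and reads optional metadata from `heartbeat` first, then `omniture`. The sketch below mirrors only that selection logic on plain dictionaries; `classify_sources` and `make_get_meta` are hypothetical names, and the sample payload in the usage note is invented for illustration, not real API output:

```python
def classify_sources(sources):
    """Pair each usable source URL with the manifest kind it would be parsed as."""
    kinds = {
        'application/vnd.apple.mpegurl': 'hls',
        'application/dash+xml': 'dash',
    }
    result = []
    for source in sources or []:
        src = source.get('src')
        if not src:
            continue  # entries without a URL are skipped, as in the diff
        kind = kinds.get(source.get('type'))
        if kind:
            result.append((kind, src))
    return result


def make_get_meta(episode):
    """Return a lookup that prefers 'heartbeat' and falls back to 'omniture'."""
    heartbeat = episode.get('heartbeat') or {}
    omniture = episode.get('omniture') or {}
    return lambda key: heartbeat.get(key) or omniture.get(key)
```

For example, with `episode = {'heartbeat': None, 'omniture': {'channel': 'antena3'}}`, `make_get_meta(episode)('channel')` yields `'antena3'`, since the `or {}` guards tolerate a missing or null block.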


@ -2,22 +2,25 @@
from __future__ import unicode_literals from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import float_or_none from ..utils import (
clean_html,
float_or_none,
)
class AudioBoomIE(InfoExtractor): class AudioBoomIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?audioboom\.com/(?:boos|posts)/(?P<id>[0-9]+)' _VALID_URL = r'https?://(?:www\.)?audioboom\.com/(?:boos|posts)/(?P<id>[0-9]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://audioboom.com/boos/4279833-3-09-2016-czaban-hour-3?t=0', 'url': 'https://audioboom.com/posts/7398103-asim-chaudhry',
'md5': '63a8d73a055c6ed0f1e51921a10a5a76', 'md5': '7b00192e593ff227e6a315486979a42d',
'info_dict': { 'info_dict': {
'id': '4279833', 'id': '7398103',
'ext': 'mp3', 'ext': 'mp3',
'title': '3/09/2016 Czaban Hour 3', 'title': 'Asim Chaudhry',
'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans', 'description': 'md5:2f3fef17dacc2595b5362e1d7d3602fc',
'duration': 2245.72, 'duration': 4000.99,
'uploader': 'SB Nation A.M.', 'uploader': 'Sue Perkins: An hour or so with...',
'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio', 'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/perkins',
} }
}, { }, {
'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0', 'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0',
@ -32,8 +35,8 @@ class AudioBoomIE(InfoExtractor):
clip = None clip = None
clip_store = self._parse_json( clip_store = self._parse_json(
self._search_regex( self._html_search_regex(
r'data-new-clip-store=(["\'])(?P<json>{.*?"clipId"\s*:\s*%s.*?})\1' % video_id, r'data-new-clip-store=(["\'])(?P<json>{.+?})\1',
webpage, 'clip store', default='{}', group='json'), webpage, 'clip store', default='{}', group='json'),
video_id, fatal=False) video_id, fatal=False)
if clip_store: if clip_store:
@ -47,14 +50,15 @@ class AudioBoomIE(InfoExtractor):
audio_url = from_clip('clipURLPriorToLoading') or self._og_search_property( audio_url = from_clip('clipURLPriorToLoading') or self._og_search_property(
'audio', webpage, 'audio url') 'audio', webpage, 'audio url')
title = from_clip('title') or self._og_search_title(webpage) title = from_clip('title') or self._html_search_meta(
description = from_clip('description') or self._og_search_description(webpage) ['og:title', 'og:audio:title', 'audio_title'], webpage)
description = from_clip('description') or clean_html(from_clip('formattedDescription')) or self._og_search_description(webpage)
duration = float_or_none(from_clip('duration') or self._html_search_meta( duration = float_or_none(from_clip('duration') or self._html_search_meta(
'weibo:audio:duration', webpage)) 'weibo:audio:duration', webpage))
uploader = from_clip('author') or self._og_search_property( uploader = from_clip('author') or self._html_search_meta(
'audio:artist', webpage, 'uploader', fatal=False) ['og:audio:artist', 'twitter:audio:artist_name', 'audio_artist'], webpage, 'uploader')
uploader_url = from_clip('author_url') or self._html_search_meta( uploader_url = from_clip('author_url') or self._html_search_meta(
'audioboo:channel', webpage, 'uploader url') 'audioboo:channel', webpage, 'uploader url')


@ -1,142 +0,0 @@
from __future__ import unicode_literals
import re
import itertools
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
sanitized_Request,
urlencode_postdata,
)
class BambuserIE(InfoExtractor):
IE_NAME = 'bambuser'
_VALID_URL = r'https?://bambuser\.com/v/(?P<id>\d+)'
_API_KEY = '005f64509e19a868399060af746a00aa'
_LOGIN_URL = 'https://bambuser.com/user'
_NETRC_MACHINE = 'bambuser'
_TEST = {
'url': 'http://bambuser.com/v/4050584',
# MD5 seems to be flaky, see https://travis-ci.org/rg3/youtube-dl/jobs/14051016#L388
# 'md5': 'fba8f7693e48fd4e8641b3fd5539a641',
'info_dict': {
'id': '4050584',
'ext': 'flv',
'title': 'Education engineering days - lightning talks',
'duration': 3741,
'uploader': 'pixelversity',
'uploader_id': '344706',
'timestamp': 1382976692,
'upload_date': '20131028',
'view_count': int,
},
'params': {
# The server ignores the 'Range' header, so the whole video would be downloaded,
# which caused the Travis builds to fail: https://travis-ci.org/rg3/youtube-dl/jobs/14493845#L59
'skip_download': True,
},
}
def _login(self):
username, password = self._get_login_info()
if username is None:
return
login_form = {
'form_id': 'user_login',
'op': 'Log in',
'name': username,
'pass': password,
}
request = sanitized_Request(
self._LOGIN_URL, urlencode_postdata(login_form))
request.add_header('Referer', self._LOGIN_URL)
response = self._download_webpage(
request, None, 'Logging in')
login_error = self._html_search_regex(
r'(?s)<div class="messages error">(.+?)</div>',
response, 'login error', default=None)
if login_error:
raise ExtractorError(
'Unable to login: %s' % login_error, expected=True)
def _real_initialize(self):
self._login()
def _real_extract(self, url):
video_id = self._match_id(url)
info = self._download_json(
'http://player-c.api.bambuser.com/getVideo.json?api_key=%s&vid=%s'
% (self._API_KEY, video_id), video_id)
error = info.get('error')
if error:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error), expected=True)
result = info['result']
return {
'id': video_id,
'title': result['title'],
'url': result['url'],
'thumbnail': result.get('preview'),
'duration': int_or_none(result.get('length')),
'uploader': result.get('username'),
'uploader_id': compat_str(result.get('owner', {}).get('uid')),
'timestamp': int_or_none(result.get('created')),
'fps': float_or_none(result.get('framerate')),
'view_count': int_or_none(result.get('views_total')),
'comment_count': int_or_none(result.get('comment_count')),
}
class BambuserChannelIE(InfoExtractor):
IE_NAME = 'bambuser:channel'
_VALID_URL = r'https?://bambuser\.com/channel/(?P<user>.*?)(?:/|#|\?|$)'
# The maximum number we can get with each request
_STEP = 50
_TEST = {
'url': 'http://bambuser.com/channel/pixelversity',
'info_dict': {
'title': 'pixelversity',
},
'playlist_mincount': 60,
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
user = mobj.group('user')
urls = []
last_id = ''
for i in itertools.count(1):
req_url = (
'http://bambuser.com/xhr-api/index.php?username={user}'
'&sort=created&access_mode=0%2C1%2C2&limit={count}'
'&method=broadcast&format=json&vid_older_than={last}'
).format(user=user, count=self._STEP, last=last_id)
req = sanitized_Request(req_url)
# Without setting this header, we wouldn't get any result
req.add_header('Referer', 'http://bambuser.com/channel/%s' % user)
data = self._download_json(
req, user, 'Downloading page %d' % i)
results = data['result']
if not results:
break
last_id = results[-1]['vid']
urls.extend(self.url_result(v['page'], 'Bambuser') for v in results)
return {
'_type': 'playlist',
'title': user,
'entries': urls,
}


@ -1,8 +1,8 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
import itertools import itertools
import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
@ -17,10 +17,12 @@ from ..utils import (
parse_iso8601, parse_iso8601,
try_get, try_get,
unescapeHTML, unescapeHTML,
url_or_none,
urlencode_postdata, urlencode_postdata,
urljoin, urljoin,
) )
from ..compat import ( from ..compat import (
compat_etree_Element,
compat_HTTPError, compat_HTTPError,
compat_urlparse, compat_urlparse,
) )
@ -38,6 +40,7 @@ class BBCCoUkIE(InfoExtractor):
iplayer(?:/[^/]+)?/(?:episode/|playlist/)| iplayer(?:/[^/]+)?/(?:episode/|playlist/)|
music/(?:clips|audiovideo/popular)[/#]| music/(?:clips|audiovideo/popular)[/#]|
radio/player/| radio/player/|
sounds/play/|
events/[^/]+/play/[^/]+/ events/[^/]+/play/[^/]+/
) )
(?P<id>%s)(?!/(?:episodes|broadcasts|clips)) (?P<id>%s)(?!/(?:episodes|broadcasts|clips))
@ -68,7 +71,7 @@ class BBCCoUkIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': 'b039d07m', 'id': 'b039d07m',
'ext': 'flv', 'ext': 'flv',
'title': 'Leonard Cohen, Kaleidoscope - BBC Radio 4', 'title': 'Kaleidoscope, Leonard Cohen',
'description': 'The Canadian poet and songwriter reflects on his musical career.', 'description': 'The Canadian poet and songwriter reflects on his musical career.',
}, },
'params': { 'params': {
@ -206,7 +209,7 @@ class BBCCoUkIE(InfoExtractor):
}, },
'skip': 'Now it\'s really geo-restricted', 'skip': 'Now it\'s really geo-restricted',
}, { }, {
# compact player (https://github.com/rg3/youtube-dl/issues/8147) # compact player (https://github.com/ytdl-org/youtube-dl/issues/8147)
'url': 'http://www.bbc.co.uk/programmes/p028bfkf/player', 'url': 'http://www.bbc.co.uk/programmes/p028bfkf/player',
'info_dict': { 'info_dict': {
'id': 'p028bfkj', 'id': 'p028bfkj',
@ -218,6 +221,20 @@ class BBCCoUkIE(InfoExtractor):
# rtmp download # rtmp download
'skip_download': True, 'skip_download': True,
}, },
}, {
'url': 'https://www.bbc.co.uk/sounds/play/m0007jzb',
'note': 'Audio',
'info_dict': {
'id': 'm0007jz9',
'ext': 'mp4',
'title': 'BBC Proms, 2019, Prom 34: WestEastern Divan Orchestra',
'description': "Live BBC Proms. WestEastern Divan Orchestra with Daniel Barenboim and Martha Argerich.",
'duration': 9840,
},
'params': {
# rtmp download
'skip_download': True,
}
}, { }, {
'url': 'http://www.bbc.co.uk/iplayer/playlist/p01dvks4', 'url': 'http://www.bbc.co.uk/iplayer/playlist/p01dvks4',
'only_matching': True, 'only_matching': True,
@ -310,7 +327,13 @@ class BBCCoUkIE(InfoExtractor):
def _get_subtitles(self, media, programme_id): def _get_subtitles(self, media, programme_id):
subtitles = {} subtitles = {}
for connection in self._extract_connections(media): for connection in self._extract_connections(media):
captions = self._download_xml(connection.get('href'), programme_id, 'Downloading captions') cc_url = url_or_none(connection.get('href'))
if not cc_url:
continue
captions = self._download_xml(
cc_url, programme_id, 'Downloading captions', fatal=False)
if not isinstance(captions, compat_etree_Element):
continue
lang = captions.get('{http://www.w3.org/XML/1998/namespace}lang', 'en') lang = captions.get('{http://www.w3.org/XML/1998/namespace}lang', 'en')
subtitles[lang] = [ subtitles[lang] = [
{ {
@ -601,7 +624,7 @@ class BBCIE(BBCCoUkIE):
'url': 'http://www.bbc.com/news/world-europe-32668511', 'url': 'http://www.bbc.com/news/world-europe-32668511',
'info_dict': { 'info_dict': {
'id': 'world-europe-32668511', 'id': 'world-europe-32668511',
'title': 'Russia stages massive WW2 parade despite Western boycott', 'title': 'Russia stages massive WW2 parade',
'description': 'md5:00ff61976f6081841f759a08bf78cc9c', 'description': 'md5:00ff61976f6081841f759a08bf78cc9c',
}, },
'playlist_count': 2, 'playlist_count': 2,


@ -99,8 +99,8 @@ class BeamProLiveIE(BeamProBaseIE):
class BeamProVodIE(BeamProBaseIE): class BeamProVodIE(BeamProBaseIE):
IE_NAME = 'Mixer:vod' IE_NAME = 'Mixer:vod'
_VALID_URL = r'https?://(?:\w+\.)?(?:beam\.pro|mixer\.com)/[^/?#&]+\?.*?\bvod=(?P<id>\d+)' _VALID_URL = r'https?://(?:\w+\.)?(?:beam\.pro|mixer\.com)/[^/?#&]+\?.*?\bvod=(?P<id>[^?#&]+)'
_TEST = { _TESTS = [{
'url': 'https://mixer.com/willow8714?vod=2259830', 'url': 'https://mixer.com/willow8714?vod=2259830',
'md5': 'b2431e6e8347dc92ebafb565d368b76b', 'md5': 'b2431e6e8347dc92ebafb565d368b76b',
'info_dict': { 'info_dict': {
@ -119,7 +119,13 @@ class BeamProVodIE(BeamProBaseIE):
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
} }, {
'url': 'https://mixer.com/streamer?vod=IxFno1rqC0S_XJ1a2yGgNw',
'only_matching': True,
}, {
'url': 'https://mixer.com/streamer?vod=Rh3LY0VAqkGpEQUe2pN-ig',
'only_matching': True,
}]
@staticmethod @staticmethod
def _extract_format(vod, vod_type): def _extract_format(vod, vod_type):


@ -1,7 +1,10 @@
from __future__ import unicode_literals from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import (
compat_str,
compat_urlparse,
)
from ..utils import ( from ..utils import (
int_or_none, int_or_none,
unified_timestamp, unified_timestamp,
@ -9,8 +12,9 @@ from ..utils import (
class BeegIE(InfoExtractor): class BeegIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?beeg\.com/(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?beeg\.(?:com|porn(?:/video)?)/(?P<id>\d+)'
_TEST = { _TESTS = [{
# api/v6 v1
'url': 'http://beeg.com/5416503', 'url': 'http://beeg.com/5416503',
'md5': 'a1a1b1a8bc70a89e49ccfd113aed0820', 'md5': 'a1a1b1a8bc70a89e49ccfd113aed0820',
'info_dict': { 'info_dict': {
@ -24,7 +28,21 @@ class BeegIE(InfoExtractor):
'tags': list, 'tags': list,
'age_limit': 18, 'age_limit': 18,
} }
} }, {
# api/v6 v2
'url': 'https://beeg.com/1941093077?t=911-1391',
'only_matching': True,
}, {
# api/v6 v2 w/o t
'url': 'https://beeg.com/1277207756',
'only_matching': True,
}, {
'url': 'https://beeg.porn/video/5416503',
'only_matching': True,
}, {
'url': 'https://beeg.porn/5416503',
'only_matching': True,
}]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
@ -35,11 +53,25 @@ class BeegIE(InfoExtractor):
r'beeg_version\s*=\s*([\da-zA-Z_-]+)', webpage, 'beeg version', r'beeg_version\s*=\s*([\da-zA-Z_-]+)', webpage, 'beeg version',
default='1546225636701') default='1546225636701')
if len(video_id) >= 10:
query = {
'v': 2,
}
qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
t = qs.get('t', [''])[0].split('-')
if len(t) > 1:
query.update({
's': t[0],
'e': t[1],
})
else:
query = {'v': 1}
for api_path in ('', 'api.'): for api_path in ('', 'api.'):
video = self._download_json( video = self._download_json(
'https://%sbeeg.com/api/v6/%s/video/%s' 'https://%sbeeg.com/api/v6/%s/video/%s'
% (api_path, beeg_version, video_id), video_id, % (api_path, beeg_version, video_id), video_id,
fatal=api_path == 'api.') fatal=api_path == 'api.', query=query)
if video: if video:
break break
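
Review note: the `v`/`s`/`e` query selection added in this hunk can be exercised in isolation. The following is a minimal sketch, assuming Python 3's `urllib.parse` in place of youtube-dl's `compat_urlparse`; `build_beeg_query` is a hypothetical helper name used only for illustration and does not exist in the extractor.

```python
from urllib.parse import parse_qs, urlparse


def build_beeg_query(url, video_id):
    """Hypothetical helper mirroring the query-building branch in the diff."""
    # IDs shorter than 10 characters keep using the v=1 variant of the API.
    if len(video_id) < 10:
        return {'v': 1}
    query = {'v': 2}
    # An optional t=START-END parameter selects a fragment of the video.
    t = parse_qs(urlparse(url).query).get('t', [''])[0].split('-')
    if len(t) > 1:
        query.update({'s': t[0], 'e': t[1]})
    return query
```

With the test URLs from this hunk, a long ID with `?t=911-1391` yields `v=2` plus `s` and `e`, a long ID without `t` yields only `v=2`, and the original short ID falls back to `v=1`.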


@ -22,7 +22,8 @@ class BellMediaIE(InfoExtractor):
bravo| bravo|
mtv| mtv|
space| space|
etalk etalk|
marilyn
)\.ca| )\.ca|
much\.com much\.com
)/.*?(?:\bvid(?:eoid)?=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})''' )/.*?(?:\bvid(?:eoid)?=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})'''
@ -70,6 +71,7 @@ class BellMediaIE(InfoExtractor):
'animalplanet': 'aniplan', 'animalplanet': 'aniplan',
'etalk': 'ctv', 'etalk': 'ctv',
'bnnbloomberg': 'bnn', 'bnnbloomberg': 'bnn',
'marilyn': 'ctv_marilyn',
} }
def _real_extract(self, url): def _real_extract(self, url):


@ -0,0 +1,37 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import extract_attributes
class BFIPlayerIE(InfoExtractor):
IE_NAME = 'bfi:player'
_VALID_URL = r'https?://player\.bfi\.org\.uk/[^/]+/film/watch-(?P<id>[\w-]+)-online'
_TEST = {
'url': 'https://player.bfi.org.uk/free/film/watch-computer-doctor-1974-online',
'md5': 'e8783ebd8e061ec4bc6e9501ed547de8',
'info_dict': {
'id': 'htNnhlZjE60C9VySkQEIBtU-cNV1Xx63',
'ext': 'mp4',
'title': 'Computer Doctor',
'description': 'md5:fb6c240d40c4dbe40428bdd62f78203b',
},
'skip': 'BFI Player films cannot be played outside of the UK',
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
entries = []
for player_el in re.findall(r'(?s)<[^>]+class="player"[^>]*>', webpage):
player_attr = extract_attributes(player_el)
ooyala_id = player_attr.get('data-video-id')
if not ooyala_id:
continue
entries.append(self.url_result(
'ooyala:' + ooyala_id, 'Ooyala',
ooyala_id, player_attr.get('data-label')))
return self.playlist_result(entries)


@ -15,6 +15,7 @@ from ..utils import (
float_or_none, float_or_none,
parse_iso8601, parse_iso8601,
smuggle_url, smuggle_url,
str_or_none,
strip_jsonp, strip_jsonp,
unified_timestamp, unified_timestamp,
unsmuggle_url, unsmuggle_url,
@ -93,8 +94,8 @@ class BiliBiliIE(InfoExtractor):
}] }]
}] }]
_APP_KEY = '84956560bc028eb7' _APP_KEY = 'iVGUTjsxvpLeuDCf'
_BILIBILI_KEY = '94aba54af9065f71de72f5508f1cd42e' _BILIBILI_KEY = 'aHRmhWMLkdeMuILqORnYZocwMBpMEOdt'
def _report_error(self, result): def _report_error(self, result):
if 'message' in result: if 'message' in result:
@ -306,3 +307,115 @@ class BiliBiliBangumiIE(InfoExtractor):
return self.playlist_result( return self.playlist_result(
entries, bangumi_id, entries, bangumi_id,
season_info.get('bangumi_title'), season_info.get('evaluate')) season_info.get('bangumi_title'), season_info.get('evaluate'))
class BilibiliAudioBaseIE(InfoExtractor):
def _call_api(self, path, sid, query=None):
if not query:
query = {'sid': sid}
return self._download_json(
'https://www.bilibili.com/audio/music-service-c/web/' + path,
sid, query=query)['data']
class BilibiliAudioIE(BilibiliAudioBaseIE):
_VALID_URL = r'https?://(?:www\.)?bilibili\.com/audio/au(?P<id>\d+)'
_TEST = {
'url': 'https://www.bilibili.com/audio/au1003142',
'md5': 'fec4987014ec94ef9e666d4d158ad03b',
'info_dict': {
'id': '1003142',
'ext': 'm4a',
'title': '【tsukimi】YELLOW / 神山羊',
'artist': 'tsukimi',
'comment_count': int,
'description': 'YELLOW的mp3版',
'duration': 183,
'subtitles': {
'origin': [{
'ext': 'lrc',
}],
},
'thumbnail': r're:^https?://.+\.jpg',
'timestamp': 1564836614,
'upload_date': '20190803',
'uploader': 'tsukimi-つきみぐー',
'view_count': int,
},
}
def _real_extract(self, url):
au_id = self._match_id(url)
play_data = self._call_api('url', au_id)
formats = [{
'url': play_data['cdns'][0],
'filesize': int_or_none(play_data.get('size')),
}]
song = self._call_api('song/info', au_id)
title = song['title']
statistic = song.get('statistic') or {}
subtitles = None
lyric = song.get('lyric')
if lyric:
subtitles = {
'origin': [{
'url': lyric,
}]
}
return {
'id': au_id,
'title': title,
'formats': formats,
'artist': song.get('author'),
'comment_count': int_or_none(statistic.get('comment')),
'description': song.get('intro'),
'duration': int_or_none(song.get('duration')),
'subtitles': subtitles,
'thumbnail': song.get('cover'),
'timestamp': int_or_none(song.get('passtime')),
'uploader': song.get('uname'),
'view_count': int_or_none(statistic.get('play')),
}
class BilibiliAudioAlbumIE(BilibiliAudioBaseIE):
_VALID_URL = r'https?://(?:www\.)?bilibili\.com/audio/am(?P<id>\d+)'
_TEST = {
'url': 'https://www.bilibili.com/audio/am10624',
'info_dict': {
'id': '10624',
'title': '每日新曲推荐每日11:00更新',
'description': '每天11:00更新为你推送最新音乐',
},
'playlist_count': 19,
}
def _real_extract(self, url):
am_id = self._match_id(url)
songs = self._call_api(
'song/of-menu', am_id, {'sid': am_id, 'pn': 1, 'ps': 100})['data']
entries = []
for song in songs:
sid = str_or_none(song.get('id'))
if not sid:
continue
entries.append(self.url_result(
'https://www.bilibili.com/audio/au' + sid,
BilibiliAudioIE.ie_key(), sid))
if entries:
album_data = self._call_api('menu/info', am_id) or {}
album_title = album_data.get('title')
if album_title:
for entry in entries:
entry['album'] = album_title
return self.playlist_result(
entries, am_id, album_title, album_data.get('intro'))
return self.playlist_result(entries, am_id)


@ -6,7 +6,6 @@ from ..utils import (
ExtractorError, ExtractorError,
remove_end, remove_end,
) )
from .rudo import RudoIE
class BioBioChileTVIE(InfoExtractor): class BioBioChileTVIE(InfoExtractor):
@ -41,11 +40,15 @@ class BioBioChileTVIE(InfoExtractor):
}, { }, {
'url': 'http://www.biobiochile.cl/noticias/bbtv/comentarios-bio-bio/2016/07/08/edecanes-del-congreso-figuras-decorativas-que-le-cuestan-muy-caro-a-los-chilenos.shtml', 'url': 'http://www.biobiochile.cl/noticias/bbtv/comentarios-bio-bio/2016/07/08/edecanes-del-congreso-figuras-decorativas-que-le-cuestan-muy-caro-a-los-chilenos.shtml',
'info_dict': { 'info_dict': {
'id': 'edecanes-del-congreso-figuras-decorativas-que-le-cuestan-muy-caro-a-los-chilenos', 'id': 'b4xd0LK3SK',
'ext': 'mp4', 'ext': 'mp4',
'uploader': '(none)', # TODO: fix url_transparent information overriding
'upload_date': '20160708', # 'uploader': 'Juan Pablo Echenique',
'title': 'Edecanes del Congreso: Figuras decorativas que le cuestan muy caro a los chilenos', 'title': 'Comentario Oscar Cáceres',
},
'params': {
# empty m3u8 manifest
'skip_download': True,
}, },
}, { }, {
'url': 'http://tv.biobiochile.cl/notas/2015/10/22/ninos-transexuales-de-quien-es-la-decision.shtml', 'url': 'http://tv.biobiochile.cl/notas/2015/10/22/ninos-transexuales-de-quien-es-la-decision.shtml',
@ -60,7 +63,9 @@ class BioBioChileTVIE(InfoExtractor):
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
rudo_url = RudoIE._extract_url(webpage) rudo_url = self._search_regex(
r'<iframe[^>]+src=(?P<q1>[\'"])(?P<url>(?:https?:)?//rudo\.video/vod/[0-9a-zA-Z]+)(?P=q1)',
webpage, 'embed URL', None, group='url')
if not rudo_url: if not rudo_url:
raise ExtractorError('No videos found') raise ExtractorError('No videos found')
@ -68,7 +73,7 @@ class BioBioChileTVIE(InfoExtractor):
thumbnail = self._og_search_thumbnail(webpage) thumbnail = self._og_search_thumbnail(webpage)
uploader = self._html_search_regex( uploader = self._html_search_regex(
r'<a[^>]+href=["\']https?://(?:busca|www)\.biobiochile\.cl/(?:lista/)?(?:author|autor)[^>]+>(.+?)</a>', r'<a[^>]+href=["\'](?:https?://(?:busca|www)\.biobiochile\.cl)?/(?:lista/)?(?:author|autor)[^>]+>(.+?)</a>',
webpage, 'uploader', fatal=False) webpage, 'uploader', fatal=False)
return { return {


@ -2,39 +2,96 @@
from __future__ import unicode_literals from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from .vk import VKIE
from ..utils import (
HEADRequest,
int_or_none,
)
class BIQLEIE(InfoExtractor): class BIQLEIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?biqle\.(?:com|org|ru)/watch/(?P<id>-?\d+_\d+)' _VALID_URL = r'https?://(?:www\.)?biqle\.(?:com|org|ru)/watch/(?P<id>-?\d+_\d+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.biqle.ru/watch/847655_160197695', # Youtube embed
'md5': 'ad5f746a874ccded7b8f211aeea96637', 'url': 'https://biqle.ru/watch/-115995369_456239081',
'md5': '97af5a06ee4c29bbf9c001bdb1cf5c06',
'info_dict': { 'info_dict': {
'id': '160197695', 'id': '8v4f-avW-VI',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Foo Fighters - The Pretender (Live at Wembley Stadium)', 'title': "PASSE-PARTOUT - L'ete c'est fait pour jouer",
'uploader': 'Andrey Rogozin', 'description': 'Passe-Partout',
'upload_date': '20110605', 'uploader_id': 'mrsimpsonstef3',
} 'uploader': 'Phanolito',
'upload_date': '20120822',
},
}, { }, {
'url': 'https://biqle.org/watch/-44781847_168547604', 'url': 'http://biqle.org/watch/-44781847_168547604',
'md5': '7f24e72af1db0edf7c1aaba513174f97', 'md5': '7f24e72af1db0edf7c1aaba513174f97',
'info_dict': { 'info_dict': {
'id': '168547604', 'id': '-44781847_168547604',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Ребенок в шоке от автоматической мойки', 'title': 'Ребенок в шоке от автоматической мойки',
'timestamp': 1396633454,
'uploader': 'Dmitry Kotov', 'uploader': 'Dmitry Kotov',
'upload_date': '20140404',
'uploader_id': '47850140',
}, },
'skip': ' This video was marked as adult. Embedding adult videos on external sites is prohibited.',
}] }]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
embed_url = self._proto_relative_url(self._search_regex( embed_url = self._proto_relative_url(self._search_regex(
r'<iframe.+?src="((?:http:)?//daxab\.com/[^"]+)".*?></iframe>', webpage, 'embed url')) r'<iframe.+?src="((?:https?:)?//(?:daxab\.com|dxb\.to|[^/]+/player)/[^"]+)".*?></iframe>',
webpage, 'embed url'))
if VKIE.suitable(embed_url):
return self.url_result(embed_url, VKIE.ie_key(), video_id)
self._request_webpage(
HEADRequest(embed_url), video_id, headers={'Referer': url})
video_id, sig, _, access_token = self._get_cookies(embed_url)['video_ext'].value.split('%3A')
item = self._download_json(
'https://api.vk.com/method/video.get', video_id,
headers={'User-Agent': 'okhttp/3.4.1'}, query={
'access_token': access_token,
'sig': sig,
'v': 5.44,
'videos': video_id,
})['response']['items'][0]
title = item['title']
formats = []
for f_id, f_url in item.get('files', {}).items():
if f_id == 'external':
return self.url_result(f_url)
ext, height = f_id.split('_')
formats.append({
'format_id': height + 'p',
'url': f_url,
'height': int_or_none(height),
'ext': ext,
})
self._sort_formats(formats)
thumbnails = []
for k, v in item.items():
if k.startswith('photo_') and v:
width = k.replace('photo_', '')
thumbnails.append({
'id': width,
'url': v,
'width': int_or_none(width),
})
return { return {
'_type': 'url_transparent', 'id': video_id,
'url': embed_url, 'title': title,
'formats': formats,
'comment_count': int_or_none(item.get('comments')),
'description': item.get('description'),
'duration': int_or_none(item.get('duration')),
'thumbnails': thumbnails,
'timestamp': int_or_none(item.get('date')),
'uploader': item.get('owner_id'),
'view_count': int_or_none(item.get('views')),
} }
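
Review note: the new format loop splits each key of `files` into an extension and a height. The sketch below reproduces that step on its own, assuming keys shaped like `mp4_720` (inferred from the `f_id.split('_')` call, not confirmed by the diff). `split_vk_files` is a hypothetical name, and the plain `sorted` call only stands in for `self._sort_formats`, which applies richer ordering rules.

```python
def split_vk_files(files):
    """Hypothetical helper: turn {'mp4_720': url, ...} into format dicts."""
    formats = []
    for f_id, f_url in files.items():
        if f_id == 'external':
            # The extractor hands an external URL to another extractor;
            # this sketch just reports it back to the caller.
            return {'external': f_url}
        ext, height = f_id.split('_')
        formats.append({
            'format_id': height + 'p',
            'url': f_url,
            'height': int(height) if height.isdigit() else None,
            'ext': ext,
        })
    # Simplified stand-in for _sort_formats: order by height, lowest first.
    return {'formats': sorted(formats, key=lambda f: f['height'] or 0)}
```

Note that a key without exactly one underscore would make the tuple unpacking raise `ValueError`, both here and in the extractor code as written.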


@ -7,6 +7,7 @@ import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
orderedSet, orderedSet,
unified_strdate,
urlencode_postdata, urlencode_postdata,
) )
@ -23,6 +24,7 @@ class BitChuteIE(InfoExtractor):
'description': 'md5:3f21f6fb5b1d17c3dee9cf6b5fe60b3a', 'description': 'md5:3f21f6fb5b1d17c3dee9cf6b5fe60b3a',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Victoria X Rave', 'uploader': 'Victoria X Rave',
'upload_date': '20170813',
}, },
}, { }, {
'url': 'https://www.bitchute.com/embed/lbb5G1hjPhw/', 'url': 'https://www.bitchute.com/embed/lbb5G1hjPhw/',
@ -55,6 +57,11 @@ class BitChuteIE(InfoExtractor):
formats = [ formats = [
{'url': format_url} {'url': format_url}
for format_url in orderedSet(format_urls)] for format_url in orderedSet(format_urls)]
if not formats:
formats = self._parse_html5_media_entries(
url, webpage, video_id)[0]['formats']
self._check_formats(formats, video_id) self._check_formats(formats, video_id)
self._sort_formats(formats) self._sort_formats(formats)
@ -65,8 +72,13 @@ class BitChuteIE(InfoExtractor):
webpage, default=None) or self._html_search_meta( webpage, default=None) or self._html_search_meta(
'twitter:image:src', webpage, 'thumbnail') 'twitter:image:src', webpage, 'thumbnail')
uploader = self._html_search_regex( uploader = self._html_search_regex(
r'(?s)<p\b[^>]+\bclass=["\']video-author[^>]+>(.+?)</p>', webpage, (r'(?s)<div class=["\']channel-banner.*?<p\b[^>]+\bclass=["\']name[^>]+>(.+?)</p>',
'uploader', fatal=False) r'(?s)<p\b[^>]+\bclass=["\']video-author[^>]+>(.+?)</p>'),
webpage, 'uploader', fatal=False)
upload_date = unified_strdate(self._search_regex(
r'class=["\']video-publish-date[^>]+>[^<]+ at \d+:\d+ UTC on (.+?)\.',
webpage, 'upload date', fatal=False))
return { return {
'id': video_id, 'id': video_id,
@ -74,6 +86,7 @@ class BitChuteIE(InfoExtractor):
'description': description, 'description': description,
'thumbnail': thumbnail, 'thumbnail': thumbnail,
'uploader': uploader, 'uploader': uploader,
'upload_date': upload_date,
'formats': formats, 'formats': formats,
} }


@ -71,7 +71,7 @@ class BleacherReportIE(InfoExtractor):
video = article_data.get('video') video = article_data.get('video')
if video: if video:
video_type = video['type'] video_type = video['type']
if video_type == 'cms.bleacherreport.com': if video_type in ('cms.bleacherreport.com', 'vid.bleacherreport.com'):
info['url'] = 'http://bleacherreport.com/video_embed?id=%s' % video['id'] info['url'] = 'http://bleacherreport.com/video_embed?id=%s' % video['id']
elif video_type == 'ooyala.com': elif video_type == 'ooyala.com':
info['url'] = 'ooyala:%s' % video['id'] info['url'] = 'ooyala:%s' % video['id']
@ -87,9 +87,9 @@ class BleacherReportIE(InfoExtractor):
class BleacherReportCMSIE(AMPIE): class BleacherReportCMSIE(AMPIE):
_VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36})' _VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36}|\d{5})'
_TESTS = [{ _TESTS = [{
'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1', 'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1&library=video-cms',
'md5': '2e4b0a997f9228ffa31fada5c53d1ed1', 'md5': '2e4b0a997f9228ffa31fada5c53d1ed1',
'info_dict': { 'info_dict': {
'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1', 'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
@ -101,6 +101,6 @@ class BleacherReportCMSIE(AMPIE):
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
info = self._extract_feed_info('http://cms.bleacherreport.com/media/items/%s/akamai.json' % video_id) info = self._extract_feed_info('http://vid.bleacherreport.com/videos/%s.akamai' % video_id)
info['id'] = video_id info['id'] = video_id
return info return info


@ -32,8 +32,8 @@ class BlinkxIE(InfoExtractor):
video_id = self._match_id(url) video_id = self._match_id(url)
display_id = video_id[:8] display_id = video_id[:8]
api_url = ('https://apib4.blinkx.com/api.php?action=play_video&' + api_url = ('https://apib4.blinkx.com/api.php?action=play_video&'
'video=%s' % video_id) + 'video=%s' % video_id)
data_json = self._download_webpage(api_url, display_id) data_json = self._download_webpage(api_url, display_id)
data = json.loads(data_json)['api']['results'][0] data = json.loads(data_json)['api']['results'][0]
duration = None duration = None


@ -11,8 +11,8 @@ from ..utils import ExtractorError
class BokeCCBaseIE(InfoExtractor): class BokeCCBaseIE(InfoExtractor):
def _extract_bokecc_formats(self, webpage, video_id, format_id=None): def _extract_bokecc_formats(self, webpage, video_id, format_id=None):
player_params_str = self._html_search_regex( player_params_str = self._html_search_regex(
r'<(?:script|embed)[^>]+src="http://p\.bokecc\.com/player\?([^"]+)', r'<(?:script|embed)[^>]+src=(?P<q>["\'])(?:https?:)?//p\.bokecc\.com/(?:player|flash/player\.swf)\?(?P<query>.+?)(?P=q)',
webpage, 'player params') webpage, 'player params', group='query')
player_params = compat_parse_qs(player_params_str) player_params = compat_parse_qs(player_params_str)
@ -36,9 +36,9 @@ class BokeCCIE(BokeCCBaseIE):
_VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)' _VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)'
_TESTS = [{ _TESTS = [{
'url': 'http://union.bokecc.com/playvideo.bo?vid=E44D40C15E65EA30&uid=CD0C5D3C8614B28B', 'url': 'http://union.bokecc.com/playvideo.bo?vid=E0ABAE9D4F509B189C33DC5901307461&uid=FE644790DE9D154A',
'info_dict': { 'info_dict': {
'id': 'CD0C5D3C8614B28B_E44D40C15E65EA30', 'id': 'FE644790DE9D154A_E0ABAE9D4F509B189C33DC5901307461',
'ext': 'flv', 'ext': 'flv',
'title': 'BokeCC Video', 'title': 'BokeCC Video',
}, },


@ -1,6 +1,8 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
from .adobepass import AdobePassIE from .adobepass import AdobePassIE
from ..utils import ( from ..utils import (
smuggle_url, smuggle_url,
@ -12,16 +14,16 @@ from ..utils import (
class BravoTVIE(AdobePassIE): class BravoTVIE(AdobePassIE):
_VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+(?P<id>[^/?#]+)' _VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.bravotv.com/last-chance-kitchen/season-5/videos/lck-ep-12-fishy-finale', 'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-season-16-winner-is',
'md5': '9086d0b7ef0ea2aabc4781d75f4e5863', 'md5': 'e34684cfea2a96cd2ee1ef3a60909de9',
'info_dict': { 'info_dict': {
'id': 'zHyk1_HU_mPy', 'id': 'epL0pmK1kQlT',
'ext': 'mp4', 'ext': 'mp4',
'title': 'LCK Ep 12: Fishy Finale', 'title': 'The Top Chef Season 16 Winner Is...',
'description': 'S13/E12: Two eliminated chefs have just 12 minutes to cook up a delicious fish dish.', 'description': 'Find out who takes the title of Top Chef!',
'uploader': 'NBCU-BRAV', 'uploader': 'NBCU-BRAV',
'upload_date': '20160302', 'upload_date': '20190314',
'timestamp': 1456945320, 'timestamp': 1552591860,
} }
}, { }, {
'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1', 'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1',
@ -32,30 +34,38 @@ class BravoTVIE(AdobePassIE):
display_id = self._match_id(url) display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
settings = self._parse_json(self._search_regex( settings = self._parse_json(self._search_regex(
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);', webpage, 'drupal settings'), r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>({.+?})</script>', webpage, 'drupal settings'),
display_id) display_id)
info = {} info = {}
query = { query = {
'mbr': 'true', 'mbr': 'true',
} }
account_pid, release_pid = [None] * 2 account_pid, release_pid = [None] * 2
tve = settings.get('sharedTVE') tve = settings.get('ls_tve')
if tve: if tve:
query['manifest'] = 'm3u' query['manifest'] = 'm3u'
mobj = re.search(r'<[^>]+id="pdk-player"[^>]+data-url=["\']?(?:https?:)?//player\.theplatform\.com/p/([^/]+)/(?:[^/]+/)*select/([^?#&"\']+)', webpage)
if mobj:
account_pid, tp_path = mobj.groups()
release_pid = tp_path.strip('/').split('/')[-1]
else:
account_pid = 'HNK2IC' account_pid = 'HNK2IC'
release_pid = tve['release_pid'] tp_path = release_pid = tve['release_pid']
if tve.get('entitlement') == 'auth': if tve.get('entitlement') == 'auth':
adobe_pass = settings.get('adobePass', {}) adobe_pass = settings.get('tve_adobe_auth', {})
resource = self._get_mvpd_resource( resource = self._get_mvpd_resource(
adobe_pass.get('adobePassResourceId', 'bravo'), adobe_pass.get('adobePassResourceId', 'bravo'),
tve['title'], release_pid, tve.get('rating')) tve['title'], release_pid, tve.get('rating'))
query['auth'] = self._extract_mvpd_auth( query['auth'] = self._extract_mvpd_auth(
url, release_pid, adobe_pass.get('adobePassRequestorId', 'bravo'), resource) url, release_pid, adobe_pass.get('adobePassRequestorId', 'bravo'), resource)
else: else:
shared_playlist = settings['shared_playlist'] shared_playlist = settings['ls_playlist']
account_pid = shared_playlist['account_pid'] account_pid = shared_playlist['account_pid']
metadata = shared_playlist['video_metadata'][shared_playlist['default_clip']] metadata = shared_playlist['video_metadata'][shared_playlist['default_clip']]
release_pid = metadata['release_pid'] tp_path = release_pid = metadata.get('release_pid')
if not release_pid:
release_pid = metadata['guid']
tp_path = 'media/guid/2140479951/' + release_pid
info.update({ info.update({
'title': metadata['title'], 'title': metadata['title'],
'description': metadata.get('description'), 'description': metadata.get('description'),
@ -67,7 +77,7 @@ class BravoTVIE(AdobePassIE):
'_type': 'url_transparent', '_type': 'url_transparent',
'id': release_pid, 'id': release_pid,
'url': smuggle_url(update_url_query( 'url': smuggle_url(update_url_query(
'http://link.theplatform.com/s/%s/%s' % (account_pid, release_pid), 'http://link.theplatform.com/s/%s/%s' % (account_pid, tp_path),
query), {'force_smil_url': True}), query), {'force_smil_url': True}),
'ie_key': 'ThePlatform', 'ie_key': 'ThePlatform',
}) })
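
Note on the hunk above: when the playlist metadata has no `release_pid`, the new code falls back to the clip's `guid` and builds the thePlatform path from a fixed `media/guid/2140479951/` prefix. A minimal sketch of that fallback, assuming a plain dict for `metadata`; the helper name `build_link_url` is hypothetical and not part of youtube-dl:

```python
def build_link_url(account_pid, metadata,
                   guid_prefix='media/guid/2140479951/'):
    """Return (release_pid, link URL), preferring release_pid over guid."""
    release_pid = metadata.get('release_pid')
    if release_pid:
        # Normal case: the release PID is also the path component.
        tp_path = release_pid
    else:
        # Fallback case: address the media by GUID instead.
        release_pid = metadata['guid']
        tp_path = guid_prefix + release_pid
    url = 'http://link.theplatform.com/s/%s/%s' % (account_pid, tp_path)
    return release_pid, url
```

In both cases the first element of the returned pair is what the extractor uses as the video `id`; only the URL path differs.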


@ -2,7 +2,6 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import base64 import base64
import json
import re import re
import struct import struct
@ -11,14 +10,12 @@ from .adobepass import AdobePassIE
from ..compat import ( from ..compat import (
compat_etree_fromstring, compat_etree_fromstring,
compat_parse_qs, compat_parse_qs,
compat_str,
compat_urllib_parse_urlparse, compat_urllib_parse_urlparse,
compat_urlparse, compat_urlparse,
compat_xml_parse_error, compat_xml_parse_error,
compat_HTTPError, compat_HTTPError,
) )
from ..utils import ( from ..utils import (
determine_ext,
ExtractorError, ExtractorError,
extract_attributes, extract_attributes,
find_xpath_attr, find_xpath_attr,
@ -27,18 +24,19 @@ from ..utils import (
js_to_json, js_to_json,
int_or_none, int_or_none,
parse_iso8601, parse_iso8601,
smuggle_url,
unescapeHTML, unescapeHTML,
unsmuggle_url, unsmuggle_url,
update_url_query, update_url_query,
clean_html, clean_html,
mimetype2ext, mimetype2ext,
UnsupportedError,
) )
class BrightcoveLegacyIE(InfoExtractor): class BrightcoveLegacyIE(InfoExtractor):
IE_NAME = 'brightcove:legacy' IE_NAME = 'brightcove:legacy'
_VALID_URL = r'(?:https?://.*brightcove\.com/(services|viewer).*?\?|brightcove:)(?P<query>.*)' _VALID_URL = r'(?:https?://.*brightcove\.com/(services|viewer).*?\?|brightcove:)(?P<query>.*)'
_FEDERATED_URL = 'http://c.brightcove.com/services/viewer/htmlFederated'
_TESTS = [ _TESTS = [
{ {
@ -55,7 +53,8 @@ class BrightcoveLegacyIE(InfoExtractor):
'timestamp': 1368213670, 'timestamp': 1368213670,
'upload_date': '20130510', 'upload_date': '20130510',
'uploader_id': '1589608506001', 'uploader_id': '1589608506001',
} },
'skip': 'The player has been deactivated by the content owner',
}, },
{ {
# From http://medianetwork.oracle.com/video/player/1785452137001 # From http://medianetwork.oracle.com/video/player/1785452137001
@ -70,6 +69,7 @@ class BrightcoveLegacyIE(InfoExtractor):
'upload_date': '20120814', 'upload_date': '20120814',
'uploader_id': '1460825906', 'uploader_id': '1460825906',
}, },
'skip': 'video not playable',
}, },
{ {
# From http://mashable.com/2013/10/26/thermoelectric-bracelet-lets-you-control-your-body-temperature/ # From http://mashable.com/2013/10/26/thermoelectric-bracelet-lets-you-control-your-body-temperature/
@ -79,7 +79,7 @@ class BrightcoveLegacyIE(InfoExtractor):
'ext': 'mp4', 'ext': 'mp4',
'title': 'This Bracelet Acts as a Personal Thermostat', 'title': 'This Bracelet Acts as a Personal Thermostat',
'description': 'md5:547b78c64f4112766ccf4e151c20b6a0', 'description': 'md5:547b78c64f4112766ccf4e151c20b6a0',
'uploader': 'Mashable', # 'uploader': 'Mashable',
'timestamp': 1382041798, 'timestamp': 1382041798,
'upload_date': '20131017', 'upload_date': '20131017',
'uploader_id': '1130468786001', 'uploader_id': '1130468786001',
@ -124,15 +124,17 @@ class BrightcoveLegacyIE(InfoExtractor):
'id': '3550319591001', 'id': '3550319591001',
}, },
'playlist_mincount': 7, 'playlist_mincount': 7,
'skip': 'Unsupported URL',
}, },
{ {
# playlist with 'playlistTab' (https://github.com/rg3/youtube-dl/issues/9965) # playlist with 'playlistTab' (https://github.com/ytdl-org/youtube-dl/issues/9965)
'url': 'http://c.brightcove.com/services/json/experience/runtime/?command=get_programming_for_experience&playerKey=AQ%7E%7E,AAABXlLMdok%7E,NJ4EoMlZ4rZdx9eU1rkMVd8EaYPBBUlg', 'url': 'http://c.brightcove.com/services/json/experience/runtime/?command=get_programming_for_experience&playerKey=AQ%7E%7E,AAABXlLMdok%7E,NJ4EoMlZ4rZdx9eU1rkMVd8EaYPBBUlg',
'info_dict': { 'info_dict': {
'id': '1522758701001', 'id': '1522758701001',
'title': 'Lesson 08', 'title': 'Lesson 08',
}, },
'playlist_mincount': 10, 'playlist_mincount': 10,
'skip': 'Unsupported URL',
}, },
{ {
# playerID inferred from bcpid # playerID inferred from bcpid
@ -141,12 +143,6 @@ class BrightcoveLegacyIE(InfoExtractor):
'only_matching': True, # Tested in GenericIE 'only_matching': True, # Tested in GenericIE
} }
] ]
FLV_VCODECS = {
1: 'SORENSON',
2: 'ON2',
3: 'H264',
4: 'VP8',
}
@classmethod @classmethod
def _build_brighcove_url(cls, object_str): def _build_brighcove_url(cls, object_str):
@ -155,10 +151,10 @@ class BrightcoveLegacyIE(InfoExtractor):
<object class="BrightcoveExperience">{params}</object> <object class="BrightcoveExperience">{params}</object>
""" """
# Fix up some stupid HTML, see https://github.com/rg3/youtube-dl/issues/1553 # Fix up some stupid HTML, see https://github.com/ytdl-org/youtube-dl/issues/1553
object_str = re.sub(r'(<param(?:\s+[a-zA-Z0-9_]+="[^"]*")*)>', object_str = re.sub(r'(<param(?:\s+[a-zA-Z0-9_]+="[^"]*")*)>',
lambda m: m.group(1) + '/>', object_str) lambda m: m.group(1) + '/>', object_str)
# Fix up some stupid XML, see https://github.com/rg3/youtube-dl/issues/1608 # Fix up some stupid XML, see https://github.com/ytdl-org/youtube-dl/issues/1608
object_str = object_str.replace('<--', '<!--') object_str = object_str.replace('<--', '<!--')
# remove namespace to simplify extraction # remove namespace to simplify extraction
object_str = re.sub(r'(<object[^>]*)(xmlns=".*?")', r'\1', object_str) object_str = re.sub(r'(<object[^>]*)(xmlns=".*?")', r'\1', object_str)
@ -238,7 +234,8 @@ class BrightcoveLegacyIE(InfoExtractor):
@classmethod @classmethod
def _make_brightcove_url(cls, params): def _make_brightcove_url(cls, params):
return update_url_query(cls._FEDERATED_URL, params) return update_url_query(
'http://c.brightcove.com/services/viewer/htmlFederated', params)
@classmethod @classmethod
def _extract_brightcove_url(cls, webpage): def _extract_brightcove_url(cls, webpage):
@ -297,38 +294,12 @@ class BrightcoveLegacyIE(InfoExtractor):
videoPlayer = query.get('@videoPlayer') videoPlayer = query.get('@videoPlayer')
if videoPlayer: if videoPlayer:
# We set the original url as the default 'Referer' header # We set the original url as the default 'Referer' header
referer = smuggled_data.get('Referer', url) referer = query.get('linkBaseURL', [None])[0] or smuggled_data.get('Referer', url)
video_id = videoPlayer[0]
if 'playerID' not in query: if 'playerID' not in query:
mobj = re.search(r'/bcpid(\d+)', url) mobj = re.search(r'/bcpid(\d+)', url)
if mobj is not None: if mobj is not None:
query['playerID'] = [mobj.group(1)] query['playerID'] = [mobj.group(1)]
return self._get_video_info(
videoPlayer[0], query, referer=referer)
elif 'playerKey' in query:
player_key = query['playerKey']
return self._get_playlist_info(player_key[0])
else:
raise ExtractorError(
'Cannot find playerKey= variable. Did you forget quotes in a shell invocation?',
expected=True)
def _brightcove_new_url_result(self, publisher_id, video_id):
brightcove_new_url = 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s' % (publisher_id, video_id)
return self.url_result(brightcove_new_url, BrightcoveNewIE.ie_key(), video_id)
def _get_video_info(self, video_id, query, referer=None):
headers = {}
linkBase = query.get('linkBaseURL')
if linkBase is not None:
referer = linkBase[0]
if referer is not None:
headers['Referer'] = referer
webpage = self._download_webpage(self._FEDERATED_URL, video_id, headers=headers, query=query)
error_msg = self._html_search_regex(
r"<h1>We're sorry.</h1>([\s\n]*<p>.*?</p>)+", webpage,
'error message', default=None)
if error_msg is not None:
publisher_id = query.get('publisherId') publisher_id = query.get('publisherId')
if publisher_id and publisher_id[0].isdigit(): if publisher_id and publisher_id[0].isdigit():
publisher_id = publisher_id[0] publisher_id = publisher_id[0]
@ -339,6 +310,9 @@ class BrightcoveLegacyIE(InfoExtractor):
else: else:
player_id = query.get('playerID') player_id = query.get('playerID')
if player_id and player_id[0].isdigit(): if player_id and player_id[0].isdigit():
headers = {}
if referer:
headers['Referer'] = referer
player_page = self._download_webpage( player_page = self._download_webpage(
'http://link.brightcove.com/services/player/bcpid' + player_id[0], 'http://link.brightcove.com/services/player/bcpid' + player_id[0],
video_id, headers=headers, fatal=False) video_id, headers=headers, fatal=False)
@ -350,140 +324,20 @@ class BrightcoveLegacyIE(InfoExtractor):
enc_pub_id = player_key.split(',')[1].replace('~', '=') enc_pub_id = player_key.split(',')[1].replace('~', '=')
publisher_id = struct.unpack('>Q', base64.urlsafe_b64decode(enc_pub_id))[0] publisher_id = struct.unpack('>Q', base64.urlsafe_b64decode(enc_pub_id))[0]
if publisher_id: if publisher_id:
return self._brightcove_new_url_result(publisher_id, video_id) brightcove_new_url = 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s' % (publisher_id, video_id)
raise ExtractorError( if referer:
'brightcove said: %s' % error_msg, expected=True) brightcove_new_url = smuggle_url(brightcove_new_url, {'referrer': referer})
return self.url_result(brightcove_new_url, BrightcoveNewIE.ie_key(), video_id)
self.report_extraction(video_id) # TODO: figure out if it's possible to extract playlistId from playerKey
info = self._search_regex(r'var experienceJSON = ({.*});', webpage, 'json') # elif 'playerKey' in query:
info = json.loads(info)['data'] # player_key = query['playerKey']
video_info = info['programmedContent']['videoPlayer']['mediaDTO'] # return self._get_playlist_info(player_key[0])
video_info['_youtubedl_adServerURL'] = info.get('adServerURL') raise UnsupportedError(url)
return self._extract_video_info(video_info)
def _get_playlist_info(self, player_key):
info_url = 'http://c.brightcove.com/services/json/experience/runtime/?command=get_programming_for_experience&playerKey=%s' % player_key
playlist_info = self._download_webpage(
info_url, player_key, 'Downloading playlist information')
json_data = json.loads(playlist_info)
if 'videoList' in json_data:
playlist_info = json_data['videoList']
playlist_dto = playlist_info['mediaCollectionDTO']
elif 'playlistTabs' in json_data:
playlist_info = json_data['playlistTabs']
playlist_dto = playlist_info['lineupListDTO']['playlistDTOs'][0]
else:
raise ExtractorError('Empty playlist')
videos = [self._extract_video_info(video_info) for video_info in playlist_dto['videoDTOs']]
return self.playlist_result(videos, playlist_id='%s' % playlist_info['id'],
playlist_title=playlist_dto['displayName'])
def _extract_video_info(self, video_info):
video_id = compat_str(video_info['id'])
publisher_id = video_info.get('publisherId')
info = {
'id': video_id,
'title': video_info['displayName'].strip(),
'description': video_info.get('shortDescription'),
'thumbnail': video_info.get('videoStillURL') or video_info.get('thumbnailURL'),
'uploader': video_info.get('publisherName'),
'uploader_id': compat_str(publisher_id) if publisher_id else None,
'duration': float_or_none(video_info.get('length'), 1000),
'timestamp': int_or_none(video_info.get('creationDate'), 1000),
}
renditions = video_info.get('renditions', []) + video_info.get('IOSRenditions', [])
if renditions:
formats = []
for rend in renditions:
url = rend['defaultURL']
if not url:
continue
ext = None
if rend['remote']:
url_comp = compat_urllib_parse_urlparse(url)
if url_comp.path.endswith('.m3u8'):
formats.extend(
self._extract_m3u8_formats(
url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
continue
elif 'akamaihd.net' in url_comp.netloc:
# This type of renditions are served through
# akamaihd.net, but they don't use f4m manifests
url = url.replace('control/', '') + '?&v=3.3.0&fp=13&r=FEEFJ&g=RTSJIMBMPFPB'
ext = 'flv'
if ext is None:
ext = determine_ext(url)
tbr = int_or_none(rend.get('encodingRate'), 1000)
a_format = {
'format_id': 'http%s' % ('-%s' % tbr if tbr else ''),
'url': url,
'ext': ext,
'filesize': int_or_none(rend.get('size')) or None,
'tbr': tbr,
}
if rend.get('audioOnly'):
a_format.update({
'vcodec': 'none',
})
else:
a_format.update({
'height': int_or_none(rend.get('frameHeight')),
'width': int_or_none(rend.get('frameWidth')),
'vcodec': rend.get('videoCodec'),
})
# m3u8 manifests with remote == false are media playlists
# Not calling _extract_m3u8_formats here to save network traffic
if ext == 'm3u8':
a_format.update({
'format_id': 'hls%s' % ('-%s' % tbr if tbr else ''),
'ext': 'mp4',
'protocol': 'm3u8_native',
})
formats.append(a_format)
self._sort_formats(formats)
info['formats'] = formats
elif video_info.get('FLVFullLengthURL') is not None:
info.update({
'url': video_info['FLVFullLengthURL'],
'vcodec': self.FLV_VCODECS.get(video_info.get('FLVFullCodec')),
'filesize': int_or_none(video_info.get('FLVFullSize')),
})
if self._downloader.params.get('include_ads', False):
adServerURL = video_info.get('_youtubedl_adServerURL')
if adServerURL:
ad_info = {
'_type': 'url',
'url': adServerURL,
}
if 'url' in info:
return {
'_type': 'playlist',
'title': info['title'],
'entries': [ad_info, info],
}
else:
return ad_info
if not info.get('url') and not info.get('formats'):
uploader_id = info.get('uploader_id')
if uploader_id:
info.update(self._brightcove_new_url_result(uploader_id, video_id))
else:
raise ExtractorError('Unable to extract video url for %s' % video_id)
return info
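
Note on the legacy extractor above: the publisher ID is recovered from the second comma-separated field of `playerKey`, which is URL-safe base64 with `~` standing in for `=` padding and decodes to one big-endian 64-bit integer. A standalone sketch of that decoding; the function names are hypothetical, and `player_key_field` is an inverse added only so the round trip can be checked:

```python
import base64
import struct


def publisher_id_from_player_key(player_key):
    # Second field of the key holds the encoded publisher ID.
    enc_pub_id = player_key.split(',')[1].replace('~', '=')
    # '>Q' unpacks a single big-endian unsigned 64-bit integer (8 bytes).
    return struct.unpack('>Q', base64.urlsafe_b64decode(enc_pub_id))[0]


def player_key_field(publisher_id):
    # Hypothetical inverse: pack the integer and restore the '~' padding.
    encoded = base64.urlsafe_b64encode(struct.pack('>Q', publisher_id))
    return encoded.decode('ascii').replace('=', '~')
```

The key must be URL-decoded first (`%7E` back to `~`), as it already is after query-string parsing.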
class BrightcoveNewIE(AdobePassIE): class BrightcoveNewIE(AdobePassIE):
IE_NAME = 'brightcove:new' IE_NAME = 'brightcove:new'
_VALID_URL = r'https?://players\.brightcove\.net/(?P<account_id>\d+)/(?P<player_id>[^/]+)_(?P<embed>[^/]+)/index\.html\?.*videoId=(?P<video_id>\d+|ref:[^&]+)' _VALID_URL = r'https?://players\.brightcove\.net/(?P<account_id>\d+)/(?P<player_id>[^/]+)_(?P<embed>[^/]+)/index\.html\?.*(?P<content_type>video|playlist)Id=(?P<video_id>\d+|ref:[^&]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://players.brightcove.net/929656772001/e41d32dc-ec74-459e-a845-6c69f7b724ea_default/index.html?videoId=4463358922001', 'url': 'http://players.brightcove.net/929656772001/e41d32dc-ec74-459e-a845-6c69f7b724ea_default/index.html?videoId=4463358922001',
'md5': 'c8100925723840d4b0d243f7025703be', 'md5': 'c8100925723840d4b0d243f7025703be',
@ -516,6 +370,21 @@ class BrightcoveNewIE(AdobePassIE):
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
} }
}, {
# playlist stream
'url': 'https://players.brightcove.net/1752604059001/S13cJdUBz_default/index.html?playlistId=5718313430001',
'info_dict': {
'id': '5718313430001',
'title': 'No Audio Playlist',
},
'playlist_count': 7,
'params': {
# m3u8 download
'skip_download': True,
}
}, {
'url': 'http://players.brightcove.net/5690807595001/HyZNerRl7_default/index.html?playlistId=5743160747001',
'only_matching': True,
}, { }, {
# ref: prefixed video id # ref: prefixed video id
'url': 'http://players.brightcove.net/3910869709001/21519b5c-4b3b-4363-accb-bdc8f358f823_default/index.html?videoId=ref:7069442', 'url': 'http://players.brightcove.net/3910869709001/21519b5c-4b3b-4363-accb-bdc8f358f823_default/index.html?videoId=ref:7069442',
@ -715,8 +584,14 @@ class BrightcoveNewIE(AdobePassIE):
'ip_blocks': smuggled_data.get('geo_ip_blocks'), 'ip_blocks': smuggled_data.get('geo_ip_blocks'),
}) })
account_id, player_id, embed, video_id = re.match(self._VALID_URL, url).groups() account_id, player_id, embed, content_type, video_id = re.match(self._VALID_URL, url).groups()
policy_key_id = '%s_%s' % (account_id, player_id)
policy_key = self._downloader.cache.load('brightcove', policy_key_id)
policy_key_extracted = False
store_pk = lambda x: self._downloader.cache.store('brightcove', policy_key_id, x)
def extract_policy_key():
webpage = self._download_webpage( webpage = self._download_webpage(
'http://players.brightcove.net/%s/%s_%s/index.min.js' 'http://players.brightcove.net/%s/%s_%s/index.min.js'
% (account_id, player_id, embed), video_id) % (account_id, player_id, embed), video_id)
@ -736,24 +611,36 @@ class BrightcoveNewIE(AdobePassIE):
r'policyKey\s*:\s*(["\'])(?P<pk>.+?)\1', r'policyKey\s*:\s*(["\'])(?P<pk>.+?)\1',
webpage, 'policy key', group='pk') webpage, 'policy key', group='pk')
api_url = 'https://edge.api.brightcove.com/playback/v1/accounts/%s/videos/%s' % (account_id, video_id) store_pk(policy_key)
headers = { return policy_key
'Accept': 'application/json;pk=%s' % policy_key,
} api_url = 'https://edge.api.brightcove.com/playback/v1/accounts/%s/%ss/%s' % (account_id, content_type, video_id)
headers = {}
referrer = smuggled_data.get('referrer') referrer = smuggled_data.get('referrer')
if referrer: if referrer:
headers.update({ headers.update({
'Referer': referrer, 'Referer': referrer,
'Origin': re.search(r'https?://[^/]+', referrer).group(0), 'Origin': re.search(r'https?://[^/]+', referrer).group(0),
}) })
for _ in range(2):
if not policy_key:
policy_key = extract_policy_key()
policy_key_extracted = True
headers['Accept'] = 'application/json;pk=%s' % policy_key
try: try:
json_data = self._download_json(api_url, video_id, headers=headers) json_data = self._download_json(api_url, video_id, headers=headers)
break
except ExtractorError as e: except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403: if isinstance(e.cause, compat_HTTPError) and e.cause.code in (401, 403):
json_data = self._parse_json(e.cause.read().decode(), video_id)[0] json_data = self._parse_json(e.cause.read().decode(), video_id)[0]
message = json_data.get('message') or json_data['error_code'] message = json_data.get('message') or json_data['error_code']
if json_data.get('error_subcode') == 'CLIENT_GEO': if json_data.get('error_subcode') == 'CLIENT_GEO':
self.raise_geo_restricted(msg=message) self.raise_geo_restricted(msg=message)
elif json_data.get('error_code') == 'INVALID_POLICY_KEY' and not policy_key_extracted:
policy_key = None
store_pk(None)
continue
raise ExtractorError(message, expected=True) raise ExtractorError(message, expected=True)
raise raise
@ -771,5 +658,12 @@ class BrightcoveNewIE(AdobePassIE):
'tveToken': tve_token, 'tveToken': tve_token,
}) })
if content_type == 'playlist':
return self.playlist_result(
[self._parse_brightcove_metadata(vid, vid.get('id'), headers)
for vid in json_data.get('videos', []) if vid.get('id')],
json_data.get('id'), json_data.get('name'),
json_data.get('description'))
return self._parse_brightcove_metadata( return self._parse_brightcove_metadata(
json_data, video_id, headers=headers) json_data, video_id, headers=headers)
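
Note on the policy key change above: the key is now read from the downloader cache, and a cached key that the API rejects as `INVALID_POLICY_KEY` is discarded and re-extracted once before giving up. A minimal sketch of that control flow, assuming a plain dict in place of the cache; `InvalidPolicyKey`, `fetch_key` and `call_api` are hypothetical stand-ins, not youtube-dl names:

```python
class InvalidPolicyKey(Exception):
    """Raised by the stand-in API call when the key is rejected."""


def call_with_cached_key(cache, cache_id, fetch_key, call_api):
    key = cache.get(cache_id)
    freshly_fetched = False
    for _ in range(2):
        if not key:
            # No usable cached key: extract one and remember it.
            key = fetch_key()
            cache[cache_id] = key
            freshly_fetched = True
        try:
            return call_api(key)
        except InvalidPolicyKey:
            if freshly_fetched:
                # A freshly extracted key failed too: retrying is pointless.
                raise
            # The cached key was stale: drop it and go round once more.
            key = None
            cache.pop(cache_id, None)
    # Not reached: the second pass either returns or raises.
    raise RuntimeError('retry loop exhausted')
```

The sketch removes the stale cache entry, whereas the diff overwrites it with `None`; either way the next pass sees no key and extracts a new one.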


@ -3,11 +3,18 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import (
determine_ext,
merge_dicts,
parse_duration,
url_or_none,
)
class BYUtvIE(InfoExtractor): class BYUtvIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?byutv\.org/(?:watch|player)/(?!event/)(?P<id>[0-9a-f-]+)(?:/(?P<display_id>[^/?#&]+))?' _VALID_URL = r'https?://(?:www\.)?byutv\.org/(?:watch|player)/(?!event/)(?P<id>[0-9a-f-]+)(?:/(?P<display_id>[^/?#&]+))?'
_TESTS = [{ _TESTS = [{
# ooyalaVOD
'url': 'http://www.byutv.org/watch/6587b9a3-89d2-42a6-a7f7-fd2f81840a7d/studio-c-season-5-episode-5', 'url': 'http://www.byutv.org/watch/6587b9a3-89d2-42a6-a7f7-fd2f81840a7d/studio-c-season-5-episode-5',
'info_dict': { 'info_dict': {
'id': 'ZvanRocTpW-G5_yZFeltTAMv6jxOU9KH', 'id': 'ZvanRocTpW-G5_yZFeltTAMv6jxOU9KH',
@ -22,6 +29,20 @@ class BYUtvIE(InfoExtractor):
'skip_download': True, 'skip_download': True,
}, },
'add_ie': ['Ooyala'], 'add_ie': ['Ooyala'],
}, {
# dvr
'url': 'https://www.byutv.org/player/8f1dab9b-b243-47c8-b525-3e2d021a3451/byu-softball-pacific-vs-byu-41219---game-2',
'info_dict': {
'id': '8f1dab9b-b243-47c8-b525-3e2d021a3451',
'display_id': 'byu-softball-pacific-vs-byu-41219---game-2',
'ext': 'mp4',
'title': 'Pacific vs. BYU (4/12/19)',
'description': 'md5:1ac7b57cb9a78015910a4834790ce1f3',
'duration': 11645,
},
'params': {
'skip_download': True
},
}, { }, {
'url': 'http://www.byutv.org/watch/6587b9a3-89d2-42a6-a7f7-fd2f81840a7d', 'url': 'http://www.byutv.org/watch/6587b9a3-89d2-42a6-a7f7-fd2f81840a7d',
'only_matching': True, 'only_matching': True,
@ -35,17 +56,19 @@ class BYUtvIE(InfoExtractor):
video_id = mobj.group('id') video_id = mobj.group('id')
display_id = mobj.group('display_id') or video_id display_id = mobj.group('display_id') or video_id
ep = self._download_json( video = self._download_json(
'https://api.byutv.org/api3/catalog/getvideosforcontent', video_id, 'https://api.byutv.org/api3/catalog/getvideosforcontent',
query={ display_id, query={
'contentid': video_id, 'contentid': video_id,
'channel': 'byutv', 'channel': 'byutv',
'x-byutv-context': 'web$US', 'x-byutv-context': 'web$US',
}, headers={ }, headers={
'x-byutv-context': 'web$US', 'x-byutv-context': 'web$US',
'x-byutv-platformkey': 'xsaaw9c7y5', 'x-byutv-platformkey': 'xsaaw9c7y5',
})['ooyalaVOD'] })
ep = video.get('ooyalaVOD')
if ep:
return { return {
'_type': 'url_transparent', '_type': 'url_transparent',
'ie_key': 'Ooyala', 'ie_key': 'Ooyala',
@ -56,3 +79,39 @@ class BYUtvIE(InfoExtractor):
'description': ep.get('description'), 'description': ep.get('description'),
'thumbnail': ep.get('imageThumbnail'), 'thumbnail': ep.get('imageThumbnail'),
} }
info = {}
formats = []
for format_id, ep in video.items():
if not isinstance(ep, dict):
continue
video_url = url_or_none(ep.get('videoUrl'))
if not video_url:
continue
ext = determine_ext(video_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
else:
formats.append({
'url': video_url,
'format_id': format_id,
})
merge_dicts(info, {
'title': ep.get('title'),
'description': ep.get('description'),
'thumbnail': ep.get('imageThumbnail'),
'duration': parse_duration(ep.get('length')),
})
self._sort_formats(formats)
return merge_dicts(info, {
'id': video_id,
'display_id': display_id,
'title': display_id,
'formats': formats,
})
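
Note on the non-Ooyala branch above: each entry's `videoUrl` is routed by file extension, with `m3u8` going to the HLS extractor, `mpd` to the DASH extractor, and anything else treated as a direct URL. A rough standalone sketch of that routing, assuming the extension can be read from the URL path; `classify_stream` is a hypothetical helper and only approximates youtube-dl's `determine_ext`:

```python
import posixpath
from urllib.parse import urlparse


def classify_stream(video_url):
    # Look at the path only, so query strings such as ?token=... are ignored.
    path = urlparse(video_url).path
    ext = posixpath.splitext(path)[1].lstrip('.').lower()
    # Manifest formats need further parsing; everything else is a plain file.
    return {'m3u8': 'hls', 'mpd': 'dash'}.get(ext, 'direct')
```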


@ -17,7 +17,7 @@ from ..utils import (
class CanvasIE(InfoExtractor): class CanvasIE(InfoExtractor):
_VALID_URL = r'https?://mediazone\.vrt\.be/api/v1/(?P<site_id>canvas|een|ketnet|vrtvideo)/assets/(?P<id>[^/?#&]+)' _VALID_URL = r'https?://mediazone\.vrt\.be/api/v1/(?P<site_id>canvas|een|ketnet|vrt(?:video|nieuws)|sporza)/assets/(?P<id>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://mediazone.vrt.be/api/v1/ketnet/assets/md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475', 'url': 'https://mediazone.vrt.be/api/v1/ketnet/assets/md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
'md5': '90139b746a0a9bd7bb631283f6e2a64e', 'md5': '90139b746a0a9bd7bb631283f6e2a64e',
@ -35,6 +35,10 @@ class CanvasIE(InfoExtractor):
'url': 'https://mediazone.vrt.be/api/v1/canvas/assets/mz-ast-5e5f90b6-2d72-4c40-82c2-e134f884e93e', 'url': 'https://mediazone.vrt.be/api/v1/canvas/assets/mz-ast-5e5f90b6-2d72-4c40-82c2-e134f884e93e',
'only_matching': True, 'only_matching': True,
}] }]
_HLS_ENTRY_PROTOCOLS_MAP = {
'HLS': 'm3u8_native',
'HLS_AES': 'm3u8',
}
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
@ -52,9 +56,9 @@ class CanvasIE(InfoExtractor):
format_url, format_type = target.get('url'), target.get('type') format_url, format_type = target.get('url'), target.get('type')
if not format_url or not format_type: if not format_url or not format_type:
continue continue
if format_type == 'HLS': if format_type in self._HLS_ENTRY_PROTOCOLS_MAP:
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4', entry_protocol='m3u8_native', format_url, video_id, 'mp4', self._HLS_ENTRY_PROTOCOLS_MAP[format_type],
m3u8_id=format_type, fatal=False)) m3u8_id=format_type, fatal=False))
elif format_type == 'HDS': elif format_type == 'HDS':
formats.extend(self._extract_f4m_formats( formats.extend(self._extract_f4m_formats(


@ -1,20 +1,19 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
from .turner import TurnerBaseIE from .turner import TurnerBaseIE
from ..utils import int_or_none
class CartoonNetworkIE(TurnerBaseIE): class CartoonNetworkIE(TurnerBaseIE):
_VALID_URL = r'https?://(?:www\.)?cartoonnetwork\.com/video/(?:[^/]+/)+(?P<id>[^/?#]+)-(?:clip|episode)\.html' _VALID_URL = r'https?://(?:www\.)?cartoonnetwork\.com/video/(?:[^/]+/)+(?P<id>[^/?#]+)-(?:clip|episode)\.html'
_TEST = { _TEST = {
'url': 'http://www.cartoonnetwork.com/video/teen-titans-go/starfire-the-cat-lady-clip.html', 'url': 'https://www.cartoonnetwork.com/video/ben-10/how-to-draw-upgrade-episode.html',
'info_dict': { 'info_dict': {
'id': '8a250ab04ed07e6c014ef3f1e2f9016c', 'id': '6e3375097f63874ebccec7ef677c1c3845fa850e',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Starfire the Cat Lady', 'title': 'How to Draw Upgrade',
'description': 'Robin decides to become a cat so that Starfire will finally love him.', 'description': 'md5:2061d83776db7e8be4879684eefe8c0f',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@ -25,18 +24,39 @@ class CartoonNetworkIE(TurnerBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
id_type, video_id = re.search(r"_cnglobal\.cvp(Video|Title)Id\s*=\s*'([^']+)';", webpage).groups()
query = ('id' if id_type == 'Video' else 'titleId') + '=' + video_id def find_field(global_re, name, content_re=None, value_re='[^"]+', fatal=False):
return self._extract_cvp_info( metadata_re = ''
'http://www.cartoonnetwork.com/video-seo-svc/episodeservices/getCvpPlaylist?networkName=CN2&' + query, video_id, { if content_re:
'secure': { metadata_re = r'|video_metadata\.content_' + content_re
'media_src': 'http://androidhls-secure.cdn.turner.com/toon/big', return self._search_regex(
'tokenizer_src': 'https://token.vgtf.net/token/token_mobile', r'(?:_cnglobal\.currentVideo\.%s%s)\s*=\s*"(%s)";' % (global_re, metadata_re, value_re),
}, webpage, name, fatal=fatal)
}, {
media_id = find_field('mediaId', 'media id', 'id', '[0-9a-f]{40}', True)
title = find_field('episodeTitle', 'title', '(?:episodeName|name)', fatal=True)
info = self._extract_ngtv_info(
media_id, {'networkId': 'cartoonnetwork'}, {
'url': url, 'url': url,
'site_name': 'CartoonNetwork', 'site_name': 'CartoonNetwork',
'auth_required': self._search_regex( 'auth_required': find_field('authType', 'auth type') != 'unauth',
r'_cnglobal\.cvpFullOrPreviewAuth\s*=\s*(true|false);',
webpage, 'auth required', default='false') == 'true',
}) })
series = find_field(
'propertyName', 'series', 'showName') or self._html_search_meta('partOfSeries', webpage)
info.update({
'id': media_id,
'display_id': display_id,
'title': title,
'description': self._html_search_meta('description', webpage),
'series': series,
'episode': title,
})
for field in ('season', 'episode'):
field_name = field + 'Number'
info[field + '_number'] = int_or_none(find_field(
field_name, field + ' number', value_re=r'\d+') or self._html_search_meta(field_name, webpage))
return info
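
Note on `find_field` above: it builds one regex that accepts either a `_cnglobal.currentVideo.<name>` assignment or, when a content name is given, a `video_metadata.content_<name>` assignment. A standalone sketch of the same pattern using `re` directly; the page snippet in the usage note is made up for illustration, and this version returns `None` instead of raising when nothing matches:

```python
import re


def find_field(webpage, global_re, content_re=None, value_re=r'[^"]+'):
    # First alternative: the _cnglobal.currentVideo.<field> assignment.
    alternatives = r'_cnglobal\.currentVideo\.' + global_re
    if content_re:
        # Optional second alternative: video_metadata.content_<field>.
        alternatives += r'|video_metadata\.content_' + content_re
    match = re.search(
        r'(?:%s)\s*=\s*"(%s)";' % (alternatives, value_re), webpage)
    return match.group(1) if match else None
```

For example, given a page containing `video_metadata.content_name = "How to Draw Upgrade";`, calling `find_field(page, 'episodeTitle', '(?:episodeName|name)')` picks up the title through the second alternative.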


@ -360,7 +360,7 @@ class CBCWatchVideoIE(CBCWatchBaseIE):
class CBCWatchIE(CBCWatchBaseIE): class CBCWatchIE(CBCWatchBaseIE):
IE_NAME = 'cbc.ca:watch' IE_NAME = 'cbc.ca:watch'
_VALID_URL = r'https?://watch\.cbc\.ca/(?:[^/]+/)+(?P<id>[0-9a-f-]+)' _VALID_URL = r'https?://(?:gem|watch)\.cbc\.ca/(?:[^/]+/)+(?P<id>[0-9a-f-]+)'
_TESTS = [{ _TESTS = [{
# geo-restricted to Canada, bypassable # geo-restricted to Canada, bypassable
'url': 'http://watch.cbc.ca/doc-zone/season-6/customer-disservice/38e815a-009e3ab12e4', 'url': 'http://watch.cbc.ca/doc-zone/season-6/customer-disservice/38e815a-009e3ab12e4',
@ -386,6 +386,9 @@ class CBCWatchIE(CBCWatchBaseIE):
'description': 'Arthur, the sweetest 8-year-old aardvark, and his pals solve all kinds of problems with humour, kindness and teamwork.', 'description': 'Arthur, the sweetest 8-year-old aardvark, and his pals solve all kinds of problems with humour, kindness and teamwork.',
}, },
'playlist_mincount': 30, 'playlist_mincount': 30,
}, {
'url': 'https://gem.cbc.ca/media/this-hour-has-22-minutes/season-26/episode-20/38e815a-0108c6c6a42',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):


@ -13,13 +13,17 @@ from ..utils import (
class CBSBaseIE(ThePlatformFeedIE): class CBSBaseIE(ThePlatformFeedIE):
def _parse_smil_subtitles(self, smil, namespace=None, subtitles_lang='en'): def _parse_smil_subtitles(self, smil, namespace=None, subtitles_lang='en'):
closed_caption_e = find_xpath_attr(smil, self._xpath_ns('.//param', namespace), 'name', 'ClosedCaptionURL') subtitles = {}
return { for k, ext in [('sMPTE-TTCCURL', 'tt'), ('ClosedCaptionURL', 'ttml'), ('webVTTCaptionURL', 'vtt')]:
'en': [{ cc_e = find_xpath_attr(smil, self._xpath_ns('.//param', namespace), 'name', k)
'ext': 'ttml', if cc_e is not None:
'url': closed_caption_e.attrib['value'], cc_url = cc_e.get('value')
}] if cc_url:
} if closed_caption_e is not None and closed_caption_e.attrib.get('value') else [] subtitles.setdefault(subtitles_lang, []).append({
'ext': ext,
'url': cc_url,
})
return subtitles
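
Note on `_parse_smil_subtitles` above: the rewrite checks three caption parameters instead of one and always returns a dict, where the old code returned an empty list when no caption was present. A minimal sketch of the new collection logic using only `xml.etree.ElementTree`, assuming a SMIL document without XML namespaces; `collect_subtitles` is a hypothetical name:

```python
import xml.etree.ElementTree as ET

# (param name, subtitle extension) pairs, in the order the diff checks them.
CAPTION_PARAMS = [
    ('sMPTE-TTCCURL', 'tt'),
    ('ClosedCaptionURL', 'ttml'),
    ('webVTTCaptionURL', 'vtt'),
]


def collect_subtitles(smil, lang='en'):
    subtitles = {}
    for name, ext in CAPTION_PARAMS:
        param = smil.find(".//param[@name='%s']" % name)
        if param is None:
            continue
        url = param.get('value')
        if url:
            # setdefault creates the per-language list on first use.
            subtitles.setdefault(lang, []).append({'ext': ext, 'url': url})
    return subtitles
```

Parameters that are missing or have an empty `value` are skipped, so a document with no captions yields `{}`.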
class CBSIE(CBSBaseIE): class CBSIE(CBSBaseIE):
@ -65,7 +69,7 @@ class CBSIE(CBSBaseIE):
last_e = None last_e = None
for item in items_data.findall('.//item'): for item in items_data.findall('.//item'):
asset_type = xpath_text(item, 'assetType') asset_type = xpath_text(item, 'assetType')
if not asset_type or asset_type in asset_types or asset_type in ('HLS_FPS', 'DASH_CENC'): if not asset_type or asset_type in asset_types or 'HLS_FPS' in asset_type or 'DASH_CENC' in asset_type:
continue continue
asset_types.append(asset_type) asset_types.append(asset_type)
query = { query = {


@ -1,40 +1,62 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
import zlib
from .common import InfoExtractor from .common import InfoExtractor
from .cbs import CBSIE from .cbs import CBSIE
from ..compat import (
compat_b64decode,
compat_urllib_parse_unquote,
)
from ..utils import ( from ..utils import (
parse_duration, parse_duration,
) )
class CBSNewsEmbedIE(CBSIE):
IE_NAME = 'cbsnews:embed'
_VALID_URL = r'https?://(?:www\.)?cbsnews\.com/embed/video[^#]*#(?P<id>.+)'
_TESTS = [{
'url': 'https://www.cbsnews.com/embed/video/?v=1.c9b5b61492913d6660db0b2f03579ef25e86307a#1Vb7b9s2EP5XBAHbT6Gt98PAMKTJ0se6LVjWYWtdGBR1stlIpEBSTtwi%2F%2FvuJNkNhmHdGxgM2NL57vjd6zt%2B8PngdN%2Fyg79qeGvhzN%2FLGrS%2F%2BuBLB531V28%2B%2BO7Qg7%2Fy97r2z3xZ42NW8yLhDbA0S0KWlHnIijwKWJBHZZnHBa8Cgbpdf%2F89NM9Hi9fXifhpr8sr%2FlP848tn%2BTdXycX25zh4cdX%2FvHl6PmmPqnWQv9w8Ed%2B9GjYRim07bFEqdG%2BZVHuwTm65A7bVRrYtR5lAyMox7pigF6W4k%2By91mjspGsJ%2BwVae4%2BsvdnaO1p73HkXs%2FVisUDTGm7R8IcdnOROeq%2B19qT1amhA1VJtPenoTUgrtfKc9m7Rq8dP7nnjwOB7wg7ADdNt7VX64DWAWlKhPtmDEq22g4GF99x6Dk9E8OSsankHXqPNKDxC%2FdK7MLKTircTDgsI3mmj4OBdSq64dy7fd1x577RU1rt4cvMtOaulFYOd%2FLewRWvDO9lIgXFpZSnkZmjbv5SxKTPoQXClFbpsf%2Fhbbpzs0IB3vb8KkyzJQ%2BywOAgCrMpgRrz%2BKk4fvb7kFbR4XJCu0gAdtNO7woCwZTu%2BBUs9bam%2Fds71drVerpeisgrubLjAB4nnOSkWQnfr5W6o1ku5Xpr1MgrCbL0M0vUyDtfLLK15WiYp47xKWSLyjFVpwVmVJSLIoCjSOFkv3W7oKsVliwZJcB9nwXpZ5GEQQwY8jNKqKCBrgjTLeFxgdCIpazojDgnRtn43J6kG7nZ6cAbxh0EeFFk4%2B1u867cY5u4344n%2FxXjCqAjucdTHgLKojNKmSfO8KRsOFY%2FzKEYCKEJBzv90QA9nfm9gL%2BHulaFqUkz9ULUYxl62B3U%2FRVNLA8IhggaPycOoBuwOCESciDQVSSUgiOMsROB%2FhKfwCKOzEk%2B4k6rWd4uuT%2FwTDz7K7t3d3WLO8ISD95jSPQbayBacthbz86XVgxHwhex5zawzgDOmtp%2F3GPcXn0VXHdSS029%2Fj99UC%2FwJUvyKQ%2FzKyixIEVlYJOn4RxxuaH43Ty9fbJ5OObykHH435XAzJTHeOF4hhEUXD8URe%2FQ%2FBT%2BMpf8d5GN02Ox%2FfiGsl7TA7POu1xZ5%2BbTzcAVKMe48mqcC21hkacVEVScM26liVVBnrKkC4CLKyzAvHu0lhEaTKMFwI3a4SN9MsrfYzdBLq2vkwRD1gVviLT8kY9h2CHH6Y%2Bix6609weFtey4ESp60WtyeWMy%2BsmBuhsoKIyuoT%2Bq2R%2FrW5qi3g%2FvzS2j40DoixDP8%2BKP0yUdpXJ4l6Vla%2Bg9vce%2BC4yM5YlUcbA%2F0jLKdpmTwvsdN5z88nAIe08%2F0HgxeG1iv%2B6Hlhjh7uiW0SDzYNI92L401uha3JKYk268UVRzdOzNQvAaJqoXzAc80dAV440NZ1WVVAAMRYQ2KrGJFmDUsq8saWSnjvIj8t78y%2FRa3JRnbHVfyFpfwoDiGpPgjzekyUiKNlU3OMlwuLMmzgvEojllYVE2Z1HhImvsnk%2BuhusTEoB21PAtSFodeFK3iYhXEH9WOG2%2FkOE833sfeG%2Ff5cfHtEFNXgYes0%2FXj7aGivUgJ9XpusCtoNcNYVVnJVrrDo0OmJAutHCpuZul4W9lLcfy7BnuLPT02%2ByXsCTk%2B9zhzswIN04YueNSK%2BPtM0jS88QdLqSLJDTLsuGZJNolm2yO0PXh3UPnz9Ix5bfIAqxPjvETQsDCEiPG4QbqNyhBZISxybLnZYCrW5H3Axp690%2F0BJdXtDZ5ITuM4
xj3f4oUHGzc5JeJmZKpp%2FjwKh4wMV%2FV1yx3emLoR0MwbG4K%2F%2BZgVep3PnzXGDHZ6a3i%2Fk%2BJrONDN13%2Bnq6tBTYk4o7cLGhBtqCC4KwacGHpEVuoH5JNro%2FE6JfE6d5RydbiR76k%2BW5wioDHBIjw1euhHjUGRB0y5A97KoaPx6MlL%2BwgboUVtUFRI%2FLemgTpdtF59ii7pab08kuPcfWzs0l%2FRI5takWnFpka0zOgWRtYcuf9aIxZMxlwr6IiGpsb6j2DQUXPl%2FimXI599Ev7fWjoPD78A',
'only_matching': True,
}]
def _real_extract(self, url):
item = self._parse_json(zlib.decompress(compat_b64decode(
compat_urllib_parse_unquote(self._match_id(url))),
-zlib.MAX_WBITS), None)['video']['items'][0]
return self._extract_video_info(item['mpxRefId'], 'cbsnews')
class CBSNewsIE(CBSIE): class CBSNewsIE(CBSIE):
IE_NAME = 'cbsnews' IE_NAME = 'cbsnews'
IE_DESC = 'CBS News' IE_DESC = 'CBS News'
_VALID_URL = r'https?://(?:www\.)?cbsnews\.com/(?:news|videos)/(?P<id>[\da-z_-]+)' _VALID_URL = r'https?://(?:www\.)?cbsnews\.com/(?:news|video)/(?P<id>[\da-z_-]+)'
_TESTS = [ _TESTS = [
{ {
# 60 minutes # 60 minutes
'url': 'http://www.cbsnews.com/news/artificial-intelligence-positioned-to-be-a-game-changer/', 'url': 'http://www.cbsnews.com/news/artificial-intelligence-positioned-to-be-a-game-changer/',
'info_dict': { 'info_dict': {
'id': '_B6Ga3VJrI4iQNKsir_cdFo9Re_YJHE_', 'id': 'Y_nf_aEg6WwO9OLAq0MpKaPgfnBUxfW4',
'ext': 'mp4', 'ext': 'flv',
'title': 'Artificial Intelligence', 'title': 'Artificial Intelligence, real-life applications',
'description': 'md5:8818145f9974431e0fb58a1b8d69613c', 'description': 'md5:a7aaf27f1b4777244de8b0b442289304',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1606, 'duration': 317,
'uploader': 'CBSI-NEW', 'uploader': 'CBSI-NEW',
'timestamp': 1498431900, 'timestamp': 1476046464,
'upload_date': '20170625', 'upload_date': '20161009',
}, },
'params': { 'params': {
# m3u8 download # rtmp download
'skip_download': True, 'skip_download': True,
}, },
}, },
{ {
'url': 'http://www.cbsnews.com/videos/fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack/', 'url': 'https://www.cbsnews.com/video/fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack/',
'info_dict': { 'info_dict': {
'id': 'SNJBOYzXiWBOvaLsdzwH8fmtP1SCd91Y', 'id': 'SNJBOYzXiWBOvaLsdzwH8fmtP1SCd91Y',
'ext': 'mp4', 'ext': 'mp4',
@ -60,37 +82,29 @@ class CBSNewsIE(CBSIE):
# 48 hours # 48 hours
'url': 'http://www.cbsnews.com/news/maria-ridulph-murder-will-the-nations-oldest-cold-case-to-go-to-trial-ever-get-solved/', 'url': 'http://www.cbsnews.com/news/maria-ridulph-murder-will-the-nations-oldest-cold-case-to-go-to-trial-ever-get-solved/',
'info_dict': { 'info_dict': {
'id': 'QpM5BJjBVEAUFi7ydR9LusS69DPLqPJ1',
'ext': 'mp4',
'title': 'Cold as Ice', 'title': 'Cold as Ice',
'description': 'Can a childhood memory of a friend\'s murder solve a 1957 cold case? "48 Hours" correspondent Erin Moriarty has the latest.', 'description': 'Can a childhood memory solve the 1957 murder of 7-year-old Maria Ridulph?',
'upload_date': '20170604',
'timestamp': 1496538000,
'uploader': 'CBSI-NEW',
},
'params': {
'skip_download': True,
}, },
'playlist_mincount': 7,
}, },
] ]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) display_id = self._match_id(url)
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, display_id)
video_info = self._parse_json(self._html_search_regex( entries = []
r'(?:<ul class="media-list items" id="media-related-items"[^>]*><li data-video-info|<div id="cbsNewsVideoPlayer" data-video-player-options)=\'({.+?})\'', for embed_url in re.findall(r'<iframe[^>]+data-src="(https?://(?:www\.)?cbsnews\.com/embed/video/[^#]*#[^"]+)"', webpage):
webpage, 'video JSON info', default='{}'), video_id, fatal=False) entries.append(self.url_result(embed_url, CBSNewsEmbedIE.ie_key()))
if entries:
if video_info: return self.playlist_result(
item = video_info['item'] if 'item' in video_info else video_info entries, playlist_title=self._html_search_meta(['og:title', 'twitter:title'], webpage),
else: playlist_description=self._html_search_meta(['og:description', 'twitter:description', 'description'], webpage))
state = self._parse_json(self._search_regex(
r'data-cbsvideoui-options=(["\'])(?P<json>{.+?})\1', webpage,
'playlist JSON info', group='json'), video_id)['state']
item = state['playlist'][state['pid']]
item = self._parse_json(self._html_search_regex(
r'CBSNEWS\.defaultPayload\s*=\s*({.+})',
webpage, 'video JSON info'), display_id)['items'][0]
return self._extract_video_info(item['mpxRefId'], 'cbsnews') return self._extract_video_info(item['mpxRefId'], 'cbsnews')
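
Note on the embed extractor's `_real_extract` added in this file: it percent-unquotes the URL fragment, base64-decodes it, inflates it as a raw deflate stream (negative `wbits`), and parses the result as JSON. A minimal standalone sketch of that round trip follows. It assumes Python 3 and uses the standard library instead of the `compat_*` helpers; `encode_fragment`, `decode_fragment` and the sample payload are hypothetical names and data, not part of the commit.

```python
import base64
import json
import zlib
from urllib.parse import quote, unquote


def encode_fragment(payload):
    """Hypothetical inverse of the decoder, used only to produce test input."""
    # Negative wbits produces a raw deflate stream without the zlib header.
    compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    raw = compressor.compress(json.dumps(payload).encode('utf-8'))
    raw += compressor.flush()
    return quote(base64.b64encode(raw).decode('ascii'), safe='')


def decode_fragment(fragment):
    """Unquote, base64-decode, raw-inflate, then parse JSON."""
    raw = base64.b64decode(unquote(fragment))
    return json.loads(zlib.decompress(raw, -zlib.MAX_WBITS).decode('utf-8'))


if __name__ == '__main__':
    sample = {'video': {'items': [{'mpxRefId': 'example-ref-id'}]}}
    decoded = decode_fragment(encode_fragment(sample))
    print(decoded['video']['items'][0]['mpxRefId'])
```

The negative `wbits` value matters: the default `zlib.decompress` call expects a zlib header, which a raw deflate payload does not carry.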

@ -1,9 +1,12 @@
# coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
int_or_none, int_or_none,
parse_iso8601, parse_iso8601,
try_get,
url_or_none,
) )
@ -18,11 +21,13 @@ class CCCIE(InfoExtractor):
'id': '1839', 'id': '1839',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Introduction to Processor Design', 'title': 'Introduction to Processor Design',
'creator': 'byterazor',
'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac', 'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20131228', 'upload_date': '20131228',
'timestamp': 1388188800, 'timestamp': 1388188800,
'duration': 3710, 'duration': 3710,
'tags': list,
} }
}, { }, {
'url': 'https://media.ccc.de/v/32c3-7368-shopshifting#download', 'url': 'https://media.ccc.de/v/32c3-7368-shopshifting#download',
@ -68,6 +73,7 @@ class CCCIE(InfoExtractor):
'id': event_id, 'id': event_id,
'display_id': display_id, 'display_id': display_id,
'title': event_data['title'], 'title': event_data['title'],
'creator': try_get(event_data, lambda x: ', '.join(x['persons'])),
'description': event_data.get('description'), 'description': event_data.get('description'),
'thumbnail': event_data.get('thumb_url'), 'thumbnail': event_data.get('thumb_url'),
'timestamp': parse_iso8601(event_data.get('date')), 'timestamp': parse_iso8601(event_data.get('date')),
@ -75,3 +81,31 @@ class CCCIE(InfoExtractor):
'tags': event_data.get('tags'), 'tags': event_data.get('tags'),
'formats': formats, 'formats': formats,
} }
class CCCPlaylistIE(InfoExtractor):
IE_NAME = 'media.ccc.de:lists'
_VALID_URL = r'https?://(?:www\.)?media\.ccc\.de/c/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://media.ccc.de/c/30c3',
'info_dict': {
'title': '30C3',
'id': '30c3',
},
'playlist_count': 135,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url).lower()
conf = self._download_json(
'https://media.ccc.de/public/conferences/' + playlist_id,
playlist_id)
entries = []
for e in conf['events']:
event_url = url_or_none(e.get('frontend_link'))
if event_url:
entries.append(self.url_result(event_url, ie=CCCIE.ie_key()))
return self.playlist_result(entries, playlist_id, conf.get('title'))
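
Note on `CCCPlaylistIE` above: the playlist is built by keeping only events whose `frontend_link` looks like a URL. The filtering step can be sketched on its own as below. This assumes Python 3; `url_or_none_sketch` is a simplified stand-in for the real `url_or_none` helper, and `collect_event_urls` is a hypothetical name. Unlike the diff, the sketch tolerates a missing `events` key.

```python
import re


def url_or_none_sketch(url):
    """Simplified stand-in for youtube-dl's url_or_none (an assumption)."""
    if not isinstance(url, str):
        return None
    url = url.strip()
    # Accept "scheme://..." and protocol-relative "//..." URLs only.
    return url if re.match(r'^(?:[a-zA-Z][\da-zA-Z.+-]*:)?//', url) else None


def collect_event_urls(conf):
    """Return the usable frontend links from a conference JSON dict."""
    urls = []
    for event in conf.get('events', []):
        event_url = url_or_none_sketch(event.get('frontend_link'))
        if event_url:
            urls.append(event_url)
    return urls
```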

View File

@ -147,6 +147,8 @@ class CeskaTelevizeIE(InfoExtractor):
is_live = item.get('type') == 'LIVE' is_live = item.get('type') == 'LIVE'
formats = [] formats = []
for format_id, stream_url in item.get('streamUrls', {}).items(): for format_id, stream_url in item.get('streamUrls', {}).items():
if 'drmOnly=true' in stream_url:
continue
if 'playerType=flash' in stream_url: if 'playerType=flash' in stream_url:
stream_formats = self._extract_m3u8_formats( stream_formats = self._extract_m3u8_formats(
stream_url, playlist_id, 'mp4', 'm3u8_native', stream_url, playlist_id, 'mp4', 'm3u8_native',
@ -155,7 +157,7 @@ class CeskaTelevizeIE(InfoExtractor):
stream_formats = self._extract_mpd_formats( stream_formats = self._extract_mpd_formats(
stream_url, playlist_id, stream_url, playlist_id,
mpd_id='dash-%s' % format_id, fatal=False) mpd_id='dash-%s' % format_id, fatal=False)
# See https://github.com/rg3/youtube-dl/issues/12119#issuecomment-280037031 # See https://github.com/ytdl-org/youtube-dl/issues/12119#issuecomment-280037031
if format_id == 'audioDescription': if format_id == 'audioDescription':
for f in stream_formats: for f in stream_formats:
f['source_preference'] = -10 f['source_preference'] = -10

View File

@ -32,7 +32,7 @@ class Channel9IE(InfoExtractor):
'upload_date': '20130828', 'upload_date': '20130828',
'session_code': 'KOS002', 'session_code': 'KOS002',
'session_room': 'Arena 1A', 'session_room': 'Arena 1A',
'session_speakers': ['Andrew Coates', 'Brady Gaster', 'Mads Kristensen', 'Ed Blankenship', 'Patrick Klug'], 'session_speakers': 'count:5',
}, },
}, { }, {
'url': 'http://channel9.msdn.com/posts/Self-service-BI-with-Power-BI-nuclear-testing', 'url': 'http://channel9.msdn.com/posts/Self-service-BI-with-Power-BI-nuclear-testing',
@ -64,15 +64,15 @@ class Channel9IE(InfoExtractor):
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
}, {
'url': 'https://channel9.msdn.com/Niners/Splendid22/Queue/76acff796e8f411184b008028e0d492b/RSS',
'info_dict': {
'id': 'Niners/Splendid22/Queue/76acff796e8f411184b008028e0d492b',
'title': 'Channel 9',
},
'playlist_mincount': 100,
}, { }, {
'url': 'https://channel9.msdn.com/Events/DEVintersection/DEVintersection-2016/RSS', 'url': 'https://channel9.msdn.com/Events/DEVintersection/DEVintersection-2016/RSS',
'info_dict': {
'id': 'Events/DEVintersection/DEVintersection-2016',
'title': 'DEVintersection 2016 Orlando Sessions',
},
'playlist_mincount': 14,
}, {
'url': 'https://channel9.msdn.com/Niners/Splendid22/Queue/76acff796e8f411184b008028e0d492b/RSS',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'https://channel9.msdn.com/Events/Speakers/scott-hanselman/RSS?UrlSafeName=scott-hanselman', 'url': 'https://channel9.msdn.com/Events/Speakers/scott-hanselman/RSS?UrlSafeName=scott-hanselman',
@ -112,11 +112,11 @@ class Channel9IE(InfoExtractor):
episode_data), content_path) episode_data), content_path)
content_id = episode_data['contentId'] content_id = episode_data['contentId']
is_session = '/Sessions(' in episode_data['api'] is_session = '/Sessions(' in episode_data['api']
content_url = 'https://channel9.msdn.com/odata' + episode_data['api'] content_url = 'https://channel9.msdn.com/odata' + episode_data['api'] + '?$select=Captions,CommentCount,MediaLengthInSeconds,PublishedDate,Rating,RatingCount,Title,VideoMP4High,VideoMP4Low,VideoMP4Medium,VideoPlayerPreviewImage,VideoWMV,VideoWMVHQ,Views,'
if is_session: if is_session:
content_url += '?$expand=Speakers' content_url += 'Code,Description,Room,Slides,Speakers,ZipFile&$expand=Speakers'
else: else:
content_url += '?$expand=Authors' content_url += 'Authors,Body&$expand=Authors'
content_data = self._download_json(content_url, content_id) content_data = self._download_json(content_url, content_id)
title = content_data['Title'] title = content_data['Title']
@ -210,7 +210,7 @@ class Channel9IE(InfoExtractor):
'id': content_id, 'id': content_id,
'title': title, 'title': title,
'description': clean_html(content_data.get('Description') or content_data.get('Body')), 'description': clean_html(content_data.get('Description') or content_data.get('Body')),
'thumbnail': content_data.get('Thumbnail') or content_data.get('VideoPlayerPreviewImage'), 'thumbnail': content_data.get('VideoPlayerPreviewImage'),
'duration': int_or_none(content_data.get('MediaLengthInSeconds')), 'duration': int_or_none(content_data.get('MediaLengthInSeconds')),
'timestamp': parse_iso8601(content_data.get('PublishedDate')), 'timestamp': parse_iso8601(content_data.get('PublishedDate')),
'avg_rating': int_or_none(content_data.get('Rating')), 'avg_rating': int_or_none(content_data.get('Rating')),

View File

@ -3,11 +3,15 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ExtractorError from ..utils import (
ExtractorError,
lowercase_escape,
url_or_none,
)
class ChaturbateIE(InfoExtractor): class ChaturbateIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^/]+\.)?chaturbate\.com/(?P<id>[^/?#]+)' _VALID_URL = r'https?://(?:[^/]+\.)?chaturbate\.com/(?:fullvideo/?\?.*?\bb=)?(?P<id>[^/?&#]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.chaturbate.com/siswet19/', 'url': 'https://www.chaturbate.com/siswet19/',
'info_dict': { 'info_dict': {
@ -21,6 +25,9 @@ class ChaturbateIE(InfoExtractor):
'skip_download': True, 'skip_download': True,
}, },
'skip': 'Room is offline', 'skip': 'Room is offline',
}, {
'url': 'https://chaturbate.com/fullvideo/?b=caylin',
'only_matching': True,
}, { }, {
'url': 'https://en.chaturbate.com/siswet19/', 'url': 'https://en.chaturbate.com/siswet19/',
'only_matching': True, 'only_matching': True,
@ -32,14 +39,34 @@ class ChaturbateIE(InfoExtractor):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage( webpage = self._download_webpage(
url, video_id, headers=self.geo_verification_headers()) 'https://chaturbate.com/%s/' % video_id, video_id,
headers=self.geo_verification_headers())
m3u8_urls = [] found_m3u8_urls = []
data = self._parse_json(
self._search_regex(
r'initialRoomDossier\s*=\s*(["\'])(?P<value>(?:(?!\1).)+)\1',
webpage, 'data', default='{}', group='value'),
video_id, transform_source=lowercase_escape, fatal=False)
if data:
m3u8_url = url_or_none(data.get('hls_source'))
if m3u8_url:
found_m3u8_urls.append(m3u8_url)
if not found_m3u8_urls:
for m in re.finditer(
r'(\\u002[27])(?P<url>http.+?\.m3u8.*?)\1', webpage):
found_m3u8_urls.append(lowercase_escape(m.group('url')))
if not found_m3u8_urls:
for m in re.finditer( for m in re.finditer(
r'(["\'])(?P<url>http.+?\.m3u8.*?)\1', webpage): r'(["\'])(?P<url>http.+?\.m3u8.*?)\1', webpage):
m3u8_fast_url, m3u8_no_fast_url = m.group('url'), m.group( found_m3u8_urls.append(m.group('url'))
'url').replace('_fast', '')
m3u8_urls = []
for found_m3u8_url in found_m3u8_urls:
m3u8_fast_url, m3u8_no_fast_url = found_m3u8_url, found_m3u8_url.replace('_fast', '')
for m3u8_url in (m3u8_fast_url, m3u8_no_fast_url): for m3u8_url in (m3u8_fast_url, m3u8_no_fast_url):
if m3u8_url not in m3u8_urls: if m3u8_url not in m3u8_urls:
m3u8_urls.append(m3u8_url) m3u8_urls.append(m3u8_url)
@ -59,7 +86,12 @@ class ChaturbateIE(InfoExtractor):
formats = [] formats = []
for m3u8_url in m3u8_urls: for m3u8_url in m3u8_urls:
m3u8_id = 'fast' if '_fast' in m3u8_url else 'slow' for known_id in ('fast', 'slow'):
if '_%s' % known_id in m3u8_url:
m3u8_id = known_id
break
else:
m3u8_id = None
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, ext='mp4', m3u8_url, video_id, ext='mp4',
# ffmpeg skips segments for fast m3u8 # ffmpeg skips segments for fast m3u8
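
Note on the two loops changed in this file: each found playlist URL is expanded into its `_fast` and non-`_fast` variants without duplicates, and the format id is then chosen with a `for`/`else` search that falls back to `None` when neither marker is present. A minimal sketch of both steps, assuming Python 3; the function names and the example URL are hypothetical.

```python
def expand_variants(found_urls):
    """Add the non-'_fast' twin of each URL, keeping first-seen order."""
    urls = []
    for found in found_urls:
        for candidate in (found, found.replace('_fast', '')):
            if candidate not in urls:
                urls.append(candidate)
    return urls


def classify(m3u8_url):
    """Mirror the for/else in the diff: first matching marker, else None."""
    for known_id in ('fast', 'slow'):
        if '_%s' % known_id in m3u8_url:
            m3u8_id = known_id
            break
    else:
        # Runs only when the loop finished without hitting break.
        m3u8_id = None
    return m3u8_id
```

Compared with the old expression `'fast' if '_fast' in m3u8_url else 'slow'`, a URL carrying neither marker is no longer labelled `slow`.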

View File

@ -0,0 +1,29 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .hbo import HBOBaseIE
class CinemaxIE(HBOBaseIE):
_VALID_URL = r'https?://(?:www\.)?cinemax\.com/(?P<path>[^/]+/video/[0-9a-z-]+-(?P<id>\d+))'
_TESTS = [{
'url': 'https://www.cinemax.com/warrior/video/s1-ep-1-recap-20126903',
'md5': '82e0734bba8aa7ef526c9dd00cf35a05',
'info_dict': {
'id': '20126903',
'ext': 'mp4',
'title': 'S1 Ep 1: Recap',
},
'expected_warnings': ['Unknown MIME type application/mp4 in DASH manifest'],
}, {
'url': 'https://www.cinemax.com/warrior/video/s1-ep-1-recap-20126903.embed',
'only_matching': True,
}]
def _real_extract(self, url):
path, video_id = re.match(self._VALID_URL, url).groups()
info = self._extract_info('https://www.cinemax.com/%s.xml' % path, video_id)
info['id'] = video_id
return info
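
Note on `CinemaxIE._real_extract` above: one regular expression yields both the page path and the numeric id, and the path is reused to build the `.xml` URL. The matching step can be checked in isolation as follows. The pattern is copied from the diff; `split_cinemax_url` is a hypothetical helper name.

```python
import re

# Pattern copied from the _VALID_URL in the diff above.
VALID_URL = r'https?://(?:www\.)?cinemax\.com/(?P<path>[^/]+/video/[0-9a-z-]+-(?P<id>\d+))'


def split_cinemax_url(url):
    """Return (xml_url, video_id) for a matching URL, or None."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        return None
    path, video_id = mobj.groups()
    return 'https://www.cinemax.com/%s.xml' % path, video_id
```

Because `re.match` anchors only at the start, a trailing `.embed` suffix is ignored and both test URLs resolve to the same XML document.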

View File

@ -65,8 +65,8 @@ class CiscoLiveBaseIE(InfoExtractor):
class CiscoLiveSessionIE(CiscoLiveBaseIE): class CiscoLiveSessionIE(CiscoLiveBaseIE):
_VALID_URL = r'https?://ciscolive\.cisco\.com/on-demand-library/\??[^#]*#/session/(?P<id>[^/?&]+)' _VALID_URL = r'https?://(?:www\.)?ciscolive(?:\.cisco)?\.com/[^#]*#/session/(?P<id>[^/?&]+)'
_TEST = { _TESTS = [{
'url': 'https://ciscolive.cisco.com/on-demand-library/?#/session/1423353499155001FoSs', 'url': 'https://ciscolive.cisco.com/on-demand-library/?#/session/1423353499155001FoSs',
'md5': 'c98acf395ed9c9f766941c70f5352e22', 'md5': 'c98acf395ed9c9f766941c70f5352e22',
'info_dict': { 'info_dict': {
@ -79,7 +79,13 @@ class CiscoLiveSessionIE(CiscoLiveBaseIE):
'uploader_id': '5647924234001', 'uploader_id': '5647924234001',
'location': '16B Mezz.', 'location': '16B Mezz.',
}, },
} }, {
'url': 'https://www.ciscolive.com/global/on-demand-library.html?search.event=ciscoliveemea2019#/session/15361595531500013WOU',
'only_matching': True,
}, {
'url': 'https://www.ciscolive.com/global/on-demand-library.html?#/session/1490051371645001kNaS',
'only_matching': True,
}]
def _real_extract(self, url): def _real_extract(self, url):
rf_id = self._match_id(url) rf_id = self._match_id(url)
@ -88,7 +94,7 @@ class CiscoLiveSessionIE(CiscoLiveBaseIE):
class CiscoLiveSearchIE(CiscoLiveBaseIE): class CiscoLiveSearchIE(CiscoLiveBaseIE):
_VALID_URL = r'https?://ciscolive\.cisco\.com/on-demand-library/' _VALID_URL = r'https?://(?:www\.)?ciscolive(?:\.cisco)?\.com/(?:global/)?on-demand-library(?:\.html|/)'
_TESTS = [{ _TESTS = [{
'url': 'https://ciscolive.cisco.com/on-demand-library/?search.event=ciscoliveus2018&search.technicallevel=scpsSkillLevel_aintroductory&search.focus=scpsSessionFocus_designAndDeployment#/', 'url': 'https://ciscolive.cisco.com/on-demand-library/?search.event=ciscoliveus2018&search.technicallevel=scpsSkillLevel_aintroductory&search.focus=scpsSessionFocus_designAndDeployment#/',
'info_dict': { 'info_dict': {
@ -98,6 +104,9 @@ class CiscoLiveSearchIE(CiscoLiveBaseIE):
}, { }, {
'url': 'https://ciscolive.cisco.com/on-demand-library/?search.technology=scpsTechnology_applicationDevelopment&search.technology=scpsTechnology_ipv6&search.focus=scpsSessionFocus_troubleshootingTroubleshooting#/', 'url': 'https://ciscolive.cisco.com/on-demand-library/?search.technology=scpsTechnology_applicationDevelopment&search.technology=scpsTechnology_ipv6&search.focus=scpsSessionFocus_troubleshootingTroubleshooting#/',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.ciscolive.com/global/on-demand-library.html?search.technicallevel=scpsSkillLevel_aintroductory&search.event=ciscoliveemea2019&search.technology=scpsTechnology_dataCenter&search.focus=scpsSessionFocus_bestPractices#/',
'only_matching': True,
}] }]
@classmethod @classmethod

View File

@ -1,20 +1,24 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import re import re
from .common import InfoExtractor from .common import InfoExtractor
class CloudflareStreamIE(InfoExtractor): class CloudflareStreamIE(InfoExtractor):
_DOMAIN_RE = r'(?:cloudflarestream\.com|(?:videodelivery|bytehighway)\.net)'
_EMBED_RE = r'embed\.%s/embed/[^/]+\.js\?.*?\bvideo=' % _DOMAIN_RE
_ID_RE = r'[\da-f]{32}|[\w-]+\.[\w-]+\.[\w-]+'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?:// https?://
(?: (?:
(?:watch\.)?cloudflarestream\.com/| (?:watch\.)?%s/|
embed\.cloudflarestream\.com/embed/[^/]+\.js\?.*?\bvideo= %s
) )
(?P<id>[\da-f]+) (?P<id>%s)
''' ''' % (_DOMAIN_RE, _EMBED_RE, _ID_RE)
_TESTS = [{ _TESTS = [{
'url': 'https://embed.cloudflarestream.com/embed/we4g.fla9.latest.js?video=31c9291ab41fac05471db4e73aa11717', 'url': 'https://embed.cloudflarestream.com/embed/we4g.fla9.latest.js?video=31c9291ab41fac05471db4e73aa11717',
'info_dict': { 'info_dict': {
@ -31,6 +35,9 @@ class CloudflareStreamIE(InfoExtractor):
}, { }, {
'url': 'https://cloudflarestream.com/31c9291ab41fac05471db4e73aa11717/manifest/video.mpd', 'url': 'https://cloudflarestream.com/31c9291ab41fac05471db4e73aa11717/manifest/video.mpd',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://embed.videodelivery.net/embed/r4xu.fla9.latest.js?video=81d80727f3022488598f68d323c1ad5e',
'only_matching': True,
}] }]
@staticmethod @staticmethod
@ -38,23 +45,28 @@ class CloudflareStreamIE(InfoExtractor):
return [ return [
mobj.group('url') mobj.group('url')
for mobj in re.finditer( for mobj in re.finditer(
r'<script[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//embed\.cloudflarestream\.com/embed/[^/]+\.js\?.*?\bvideo=[\da-f]+?.*?)\1', r'<script[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//%s(?:%s).*?)\1' % (CloudflareStreamIE._EMBED_RE, CloudflareStreamIE._ID_RE),
webpage)] webpage)]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
domain = 'bytehighway.net' if 'bytehighway.net/' in url else 'videodelivery.net'
base_url = 'https://%s/%s/' % (domain, video_id)
if '.' in video_id:
video_id = self._parse_json(base64.urlsafe_b64decode(
video_id.split('.')[1]), video_id)['sub']
manifest_base_url = base_url + 'manifest/video.'
formats = self._extract_m3u8_formats( formats = self._extract_m3u8_formats(
'https://cloudflarestream.com/%s/manifest/video.m3u8' % video_id, manifest_base_url + 'm3u8', video_id, 'mp4',
video_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls', 'm3u8_native', m3u8_id='hls', fatal=False)
fatal=False)
formats.extend(self._extract_mpd_formats( formats.extend(self._extract_mpd_formats(
'https://cloudflarestream.com/%s/manifest/video.mpd' % video_id, manifest_base_url + 'mpd', video_id, mpd_id='dash', fatal=False))
video_id, mpd_id='dash', fatal=False))
self._sort_formats(formats) self._sort_formats(formats)
return { return {
'id': video_id, 'id': video_id,
'title': video_id, 'title': video_id,
'thumbnail': base_url + 'thumbnails/thumbnail.jpg',
'formats': formats, 'formats': formats,
} }
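
Note on the dotted ids accepted by the new `_ID_RE`: when the id has three dot-separated parts, the diff base64url-decodes the middle part and reads the real video id from its `sub` field. A standalone sketch of that step follows, assuming Python 3. It restores the `=` padding before decoding, which is an addition of this sketch and not something the diff does; `video_id_from_token` and the sample token are hypothetical.

```python
import base64
import json


def video_id_from_token(token):
    """Return the 'sub' claim of a dotted token, or the input if it has no dots."""
    if '.' not in token:
        return token
    payload = token.split('.')[1]
    # urlsafe_b64decode rejects input whose length is not a multiple of 4,
    # so re-add any '=' padding that was stripped from the token.
    payload += '=' * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload).decode('utf-8'))['sub']


if __name__ == '__main__':
    claims = json.dumps({'sub': '31c9291ab41fac05471db4e73aa11717'}).encode('utf-8')
    middle = base64.urlsafe_b64encode(claims).rstrip(b'=').decode('ascii')
    print(video_id_from_token('header.%s.signature' % middle))
```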

View File

@ -1,74 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
parse_duration,
parse_iso8601,
)
class ComCarCoffIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?comediansincarsgettingcoffee\.com/(?P<id>[a-z0-9\-]*)'
_TESTS = [{
'url': 'http://comediansincarsgettingcoffee.com/miranda-sings-happy-thanksgiving-miranda/',
'info_dict': {
'id': '2494164',
'ext': 'mp4',
'upload_date': '20141127',
'timestamp': 1417107600,
'duration': 1232,
'title': 'Happy Thanksgiving Miranda',
'description': 'Jerry Seinfeld and his special guest Miranda Sings cruise around town in search of coffee, complaining and apologizing along the way.',
},
'params': {
'skip_download': 'requires ffmpeg',
}
}]
def _real_extract(self, url):
display_id = self._match_id(url)
if not display_id:
display_id = 'comediansincarsgettingcoffee.com'
webpage = self._download_webpage(url, display_id)
full_data = self._parse_json(
self._search_regex(
r'window\.app\s*=\s*({.+?});\n', webpage, 'full data json'),
display_id)['videoData']
display_id = full_data['activeVideo']['video']
video_data = full_data.get('videos', {}).get(display_id) or full_data['singleshots'][display_id]
video_id = compat_str(video_data['mediaId'])
title = video_data['title']
formats = self._extract_m3u8_formats(
video_data['mediaUrl'], video_id, 'mp4')
self._sort_formats(formats)
thumbnails = [{
'url': video_data['images']['thumb'],
}, {
'url': video_data['images']['poster'],
}]
timestamp = int_or_none(video_data.get('pubDateTime')) or parse_iso8601(
video_data.get('pubDate'))
duration = int_or_none(video_data.get('durationSeconds')) or parse_duration(
video_data.get('duration'))
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': video_data.get('description'),
'timestamp': timestamp,
'duration': duration,
'thumbnails': thumbnails,
'formats': formats,
'season_number': int_or_none(video_data.get('season')),
'episode_number': int_or_none(video_data.get('episode')),
'webpage_url': 'http://comediansincarsgettingcoffee.com/%s' % (video_data.get('urlSlug', video_data.get('slug'))),
}

View File

@ -17,6 +17,7 @@ import math
from ..compat import ( from ..compat import (
compat_cookiejar, compat_cookiejar,
compat_cookies, compat_cookies,
compat_etree_Element,
compat_etree_fromstring, compat_etree_fromstring,
compat_getpass, compat_getpass,
compat_integer_types, compat_integer_types,
@ -43,6 +44,7 @@ from ..utils import (
compiled_regex_type, compiled_regex_type,
determine_ext, determine_ext,
determine_protocol, determine_protocol,
dict_get,
error_to_compat_str, error_to_compat_str,
ExtractorError, ExtractorError,
extract_attributes, extract_attributes,
@ -55,13 +57,17 @@ from ..utils import (
JSON_LD_RE, JSON_LD_RE,
mimetype2ext, mimetype2ext,
orderedSet, orderedSet,
parse_bitrate,
parse_codecs, parse_codecs,
parse_duration, parse_duration,
parse_iso8601, parse_iso8601,
parse_m3u8_attributes, parse_m3u8_attributes,
parse_resolution,
RegexNotFoundError, RegexNotFoundError,
sanitized_Request, sanitized_Request,
sanitize_filename, sanitize_filename,
str_or_none,
strip_or_none,
unescapeHTML, unescapeHTML,
unified_strdate, unified_strdate,
unified_timestamp, unified_timestamp,
@ -102,10 +108,26 @@ class InfoExtractor(object):
from worst to best quality. from worst to best quality.
Potential fields: Potential fields:
* url Mandatory. The URL of the video file * url The mandatory URL representing the media:
for plain file media - HTTP URL of this file,
for RTMP - RTMP URL,
for HLS - URL of the M3U8 media playlist,
for HDS - URL of the F4M manifest,
for DASH
- HTTP URL to plain file media (in case of
unfragmented media)
- URL of the MPD manifest or base URL
representing the media if MPD manifest
is parsed from a string (in case of
fragmented media)
for MSS - URL of the ISM manifest.
* manifest_url * manifest_url
The URL of the manifest file in case of The URL of the manifest file in case of
fragmented media (DASH, hls, hds) fragmented media:
for HLS - URL of the M3U8 master playlist,
for HDS - URL of the F4M manifest,
for DASH - URL of the MPD manifest,
for MSS - URL of the ISM manifest.
* ext Will be calculated from URL if missing * ext Will be calculated from URL if missing
* format A human-readable description of the format * format A human-readable description of the format
("mp4 container with h264/opus"). ("mp4 container with h264/opus").
@ -198,7 +220,7 @@ class InfoExtractor(object):
* "preference" (optional, int) - quality of the image * "preference" (optional, int) - quality of the image
* "width" (optional, int) * "width" (optional, int)
* "height" (optional, int) * "height" (optional, int)
* "resolution" (optional, string "{width}x{height"}, * "resolution" (optional, string "{width}x{height}",
deprecated) deprecated)
* "filesize" (optional, int) * "filesize" (optional, int)
thumbnail: Full URL to a video thumbnail image. thumbnail: Full URL to a video thumbnail image.
@ -521,11 +543,11 @@ class InfoExtractor(object):
raise ExtractorError('An extractor error has occurred.', cause=e) raise ExtractorError('An extractor error has occurred.', cause=e)
def __maybe_fake_ip_and_retry(self, countries): def __maybe_fake_ip_and_retry(self, countries):
if (not self._downloader.params.get('geo_bypass_country', None) and if (not self._downloader.params.get('geo_bypass_country', None)
self._GEO_BYPASS and and self._GEO_BYPASS
self._downloader.params.get('geo_bypass', True) and and self._downloader.params.get('geo_bypass', True)
not self._x_forwarded_for_ip and and not self._x_forwarded_for_ip
countries): and countries):
country_code = random.choice(countries) country_code = random.choice(countries)
self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code) self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code)
if self._x_forwarded_for_ip: if self._x_forwarded_for_ip:
@ -661,8 +683,8 @@ class InfoExtractor(object):
def __check_blocked(self, content): def __check_blocked(self, content):
first_block = content[:512] first_block = content[:512]
if ('<title>Access to this site is blocked</title>' in content and if ('<title>Access to this site is blocked</title>' in content
'Websense' in first_block): and 'Websense' in first_block):
msg = 'Access to this webpage has been blocked by Websense filtering software in your network.' msg = 'Access to this webpage has been blocked by Websense filtering software in your network.'
blocked_iframe = self._html_search_regex( blocked_iframe = self._html_search_regex(
r'<iframe src="([^"]+)"', content, r'<iframe src="([^"]+)"', content,
@ -680,8 +702,8 @@ class InfoExtractor(object):
if block_msg: if block_msg:
msg += ' (Message: "%s")' % block_msg.replace('\n', ' ') msg += ' (Message: "%s")' % block_msg.replace('\n', ' ')
raise ExtractorError(msg, expected=True) raise ExtractorError(msg, expected=True)
if ('<title>TTK :: Доступ к ресурсу ограничен</title>' in content and if ('<title>TTK :: Доступ к ресурсу ограничен</title>' in content
'blocklist.rkn.gov.ru' in content): and 'blocklist.rkn.gov.ru' in content):
raise ExtractorError( raise ExtractorError(
'Access to this webpage has been blocked by decision of the Russian government. ' 'Access to this webpage has been blocked by decision of the Russian government. '
'Visit http://blocklist.rkn.gov.ru/ for a block reason.', 'Visit http://blocklist.rkn.gov.ru/ for a block reason.',
@ -788,7 +810,7 @@ class InfoExtractor(object):
fatal=True, encoding=None, data=None, headers={}, query={}, fatal=True, encoding=None, data=None, headers={}, query={},
expected_status=None): expected_status=None):
""" """
Return a tuple (xml as an xml.etree.ElementTree.Element, URL handle). Return a tuple (xml as an compat_etree_Element, URL handle).
See _download_webpage docstring for arguments specification. See _download_webpage docstring for arguments specification.
""" """
@ -809,7 +831,7 @@ class InfoExtractor(object):
transform_source=None, fatal=True, encoding=None, transform_source=None, fatal=True, encoding=None,
data=None, headers={}, query={}, expected_status=None): data=None, headers={}, query={}, expected_status=None):
""" """
Return the xml as an xml.etree.ElementTree.Element. Return the xml as an compat_etree_Element.
See _download_webpage docstring for arguments specification. See _download_webpage docstring for arguments specification.
""" """
@ -1058,7 +1080,7 @@ class InfoExtractor(object):
@staticmethod @staticmethod
def _og_regexes(prop): def _og_regexes(prop):
content_re = r'content=(?:"([^"]+?)"|\'([^\']+?)\'|\s*([^\s"\'=<>`]+?))' content_re = r'content=(?:"([^"]+?)"|\'([^\']+?)\'|\s*([^\s"\'=<>`]+?))'
property_re = (r'(?:name|property)=(?:\'og:%(prop)s\'|"og:%(prop)s"|\s*og:%(prop)s\b)' property_re = (r'(?:name|property)=(?:\'og[:-]%(prop)s\'|"og[:-]%(prop)s"|\s*og[:-]%(prop)s\b)'
% {'prop': re.escape(prop)}) % {'prop': re.escape(prop)})
template = r'<meta[^>]+?%s[^>]+?%s' template = r'<meta[^>]+?%s[^>]+?%s'
return [ return [
@ -1249,7 +1271,10 @@ class InfoExtractor(object):
info['title'] = episode_name info['title'] = episode_name
part_of_season = e.get('partOfSeason') part_of_season = e.get('partOfSeason')
if isinstance(part_of_season, dict) and part_of_season.get('@type') in ('TVSeason', 'Season', 'CreativeWorkSeason'): if isinstance(part_of_season, dict) and part_of_season.get('@type') in ('TVSeason', 'Season', 'CreativeWorkSeason'):
info['season_number'] = int_or_none(part_of_season.get('seasonNumber')) info.update({
'season': unescapeHTML(part_of_season.get('name')),
'season_number': int_or_none(part_of_season.get('seasonNumber')),
})
part_of_series = e.get('partOfSeries') or e.get('partOfTVSeries') part_of_series = e.get('partOfSeries') or e.get('partOfTVSeries')
if isinstance(part_of_series, dict) and part_of_series.get('@type') in ('TVSeries', 'Series', 'CreativeWorkSeries'): if isinstance(part_of_series, dict) and part_of_series.get('@type') in ('TVSeries', 'Series', 'CreativeWorkSeries'):
info['series'] = unescapeHTML(part_of_series.get('name')) info['series'] = unescapeHTML(part_of_series.get('name'))
@ -1399,12 +1424,10 @@ class InfoExtractor(object):
try: try:
self._request_webpage(url, video_id, 'Checking %s URL' % item, headers=headers) self._request_webpage(url, video_id, 'Checking %s URL' % item, headers=headers)
return True return True
except ExtractorError as e: except ExtractorError:
if isinstance(e.cause, compat_urllib_error.URLError):
self.to_screen( self.to_screen(
'%s: %s URL is invalid, skipping' % (video_id, item)) '%s: %s URL is invalid, skipping' % (video_id, item))
return False return False
raise
def http_scheme(self): def http_scheme(self):
""" Either "http:" or "https:", depending on the user's preferences """ """ Either "http:" or "https:", depending on the user's preferences """
@ -1432,14 +1455,14 @@ class InfoExtractor(object):
def _extract_f4m_formats(self, manifest_url, video_id, preference=None, f4m_id=None, def _extract_f4m_formats(self, manifest_url, video_id, preference=None, f4m_id=None,
transform_source=lambda s: fix_xml_ampersands(s).strip(), transform_source=lambda s: fix_xml_ampersands(s).strip(),
fatal=True, m3u8_id=None): fatal=True, m3u8_id=None, data=None, headers={}, query={}):
manifest = self._download_xml( manifest = self._download_xml(
manifest_url, video_id, 'Downloading f4m manifest', manifest_url, video_id, 'Downloading f4m manifest',
'Unable to download f4m manifest', 'Unable to download f4m manifest',
# Some manifests may be malformed, e.g. prosiebensat1 generated manifests # Some manifests may be malformed, e.g. prosiebensat1 generated manifests
# (see https://github.com/rg3/youtube-dl/issues/6215#issuecomment-121704244) # (see https://github.com/ytdl-org/youtube-dl/issues/6215#issuecomment-121704244)
transform_source=transform_source, transform_source=transform_source,
fatal=fatal) fatal=fatal, data=data, headers=headers, query=query)
if manifest is False: if manifest is False:
return [] return []
@ -1451,6 +1474,9 @@ class InfoExtractor(object):
def _parse_f4m_formats(self, manifest, manifest_url, video_id, preference=None, f4m_id=None, def _parse_f4m_formats(self, manifest, manifest_url, video_id, preference=None, f4m_id=None,
transform_source=lambda s: fix_xml_ampersands(s).strip(), transform_source=lambda s: fix_xml_ampersands(s).strip(),
fatal=True, m3u8_id=None): fatal=True, m3u8_id=None):
if not isinstance(manifest, compat_etree_Element) and not fatal:
return []
# currently youtube-dl cannot decode the playerVerificationChallenge as Akamai uses Adobe Alchemy # currently youtube-dl cannot decode the playerVerificationChallenge as Akamai uses Adobe Alchemy
akamai_pv = manifest.find('{http://ns.adobe.com/f4m/1.0}pv-2.0') akamai_pv = manifest.find('{http://ns.adobe.com/f4m/1.0}pv-2.0')
if akamai_pv is not None and ';' in akamai_pv.text: if akamai_pv is not None and ';' in akamai_pv.text:
@ -1465,7 +1491,7 @@ class InfoExtractor(object):
manifest_version = '2.0' manifest_version = '2.0'
media_nodes = manifest.findall('{http://ns.adobe.com/f4m/2.0}media') media_nodes = manifest.findall('{http://ns.adobe.com/f4m/2.0}media')
# Remove unsupported DRM protected media from final formats # Remove unsupported DRM protected media from final formats
# rendition (see https://github.com/rg3/youtube-dl/issues/8573). # rendition (see https://github.com/ytdl-org/youtube-dl/issues/8573).
media_nodes = remove_encrypted_media(media_nodes) media_nodes = remove_encrypted_media(media_nodes)
if not media_nodes: if not media_nodes:
return formats return formats
@ -1560,12 +1586,13 @@ class InfoExtractor(object):
def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None, def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
entry_protocol='m3u8', preference=None, entry_protocol='m3u8', preference=None,
m3u8_id=None, note=None, errnote=None, m3u8_id=None, note=None, errnote=None,
fatal=True, live=False): fatal=True, live=False, data=None, headers={},
query={}):
res = self._download_webpage_handle( res = self._download_webpage_handle(
m3u8_url, video_id, m3u8_url, video_id,
note=note or 'Downloading m3u8 information', note=note or 'Downloading m3u8 information',
errnote=errnote or 'Failed to download m3u8 information', errnote=errnote or 'Failed to download m3u8 information',
fatal=fatal) fatal=fatal, data=data, headers=headers, query=query)
if res is False: if res is False:
return [] return []
@ -1595,7 +1622,8 @@ class InfoExtractor(object):
# References: # References:
# 1. https://tools.ietf.org/html/draft-pantos-http-live-streaming-21 # 1. https://tools.ietf.org/html/draft-pantos-http-live-streaming-21
# 2. https://github.com/rg3/youtube-dl/issues/12211 # 2. https://github.com/ytdl-org/youtube-dl/issues/12211
# 3. https://github.com/ytdl-org/youtube-dl/issues/18923
# We should try extracting formats only from master playlists [1, 4.3.4], # We should try extracting formats only from master playlists [1, 4.3.4],
# i.e. playlists that describe available qualities. On the other hand # i.e. playlists that describe available qualities. On the other hand
@ -1667,17 +1695,22 @@ class InfoExtractor(object):
rendition = stream_group[0] rendition = stream_group[0]
return rendition.get('NAME') or stream_group_id return rendition.get('NAME') or stream_group_id
# parse EXT-X-MEDIA tags before EXT-X-STREAM-INF in order to have the
# chance to detect video only formats when EXT-X-STREAM-INF tags
# precede EXT-X-MEDIA tags in HLS manifest such as [3].
for line in m3u8_doc.splitlines():
if line.startswith('#EXT-X-MEDIA:'):
extract_media(line)
for line in m3u8_doc.splitlines(): for line in m3u8_doc.splitlines():
if line.startswith('#EXT-X-STREAM-INF:'): if line.startswith('#EXT-X-STREAM-INF:'):
last_stream_inf = parse_m3u8_attributes(line) last_stream_inf = parse_m3u8_attributes(line)
elif line.startswith('#EXT-X-MEDIA:'):
extract_media(line)
elif line.startswith('#') or not line.strip(): elif line.startswith('#') or not line.strip():
continue continue
else: else:
tbr = float_or_none( tbr = float_or_none(
last_stream_inf.get('AVERAGE-BANDWIDTH') or last_stream_inf.get('AVERAGE-BANDWIDTH')
last_stream_inf.get('BANDWIDTH'), scale=1000) or last_stream_inf.get('BANDWIDTH'), scale=1000)
format_id = [] format_id = []
if m3u8_id: if m3u8_id:
format_id.append(m3u8_id) format_id.append(m3u8_id)
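Review note: a minimal Python 3 sketch of why the `#EXT-X-MEDIA` pass now runs first. It is not the extractor's code: `parse_attrs` is a simplified, hypothetical stand-in for `parse_m3u8_attributes`, `has_separate_audio` is an invented field, and the manifest is invented. It only illustrates that a variant listed before its audio group can still be matched to that group when the groups are collected up front.

```python
import re


def parse_attrs(line):
    # Hypothetical stand-in for parse_m3u8_attributes: KEY=VALUE pairs after the tag name.
    attrs = {}
    for key, val in re.findall(r'([A-Z0-9-]+)=("[^"]*"|[^",]+)', line.split(':', 1)[1]):
        attrs[key] = val.strip('"')
    return attrs


def audio_groups_then_variants(m3u8_doc):
    groups = {}
    # Pass 1: collect every EXT-X-MEDIA rendition, wherever it appears.
    for line in m3u8_doc.splitlines():
        if line.startswith('#EXT-X-MEDIA:'):
            media = parse_attrs(line)
            groups.setdefault(media.get('GROUP-ID'), []).append(media)
    variants = []
    last_stream_inf = {}
    # Pass 2: walk the variants; group information is already complete here.
    for line in m3u8_doc.splitlines():
        if line.startswith('#EXT-X-STREAM-INF:'):
            last_stream_inf = parse_attrs(line)
        elif line.startswith('#') or not line.strip():
            continue
        else:
            variants.append({
                'url': line.strip(),
                'has_separate_audio': last_stream_inf.get('AUDIO') in groups,
            })
            last_stream_inf = {}
    return variants


doc = '\n'.join([
    '#EXTM3U',
    '#EXT-X-STREAM-INF:BANDWIDTH=800000,AUDIO="aud"',
    'video.m3u8',
    '#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud",NAME="English",URI="audio.m3u8"',
])
variants = audio_groups_then_variants(doc)
print(variants)
```

With a single interleaved loop, the `aud` group would still be unknown when `video.m3u8` is reached, so the variant could not be recognised as video only.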
@ -1733,6 +1766,19 @@ class InfoExtractor(object):
# the same GROUP-ID # the same GROUP-ID
f['acodec'] = 'none' f['acodec'] = 'none'
formats.append(f) formats.append(f)
# for DailyMotion
progressive_uri = last_stream_inf.get('PROGRESSIVE-URI')
if progressive_uri:
http_f = f.copy()
del http_f['manifest_url']
http_f.update({
'format_id': f['format_id'].replace('hls-', 'http-'),
'protocol': 'http',
'url': progressive_uri,
})
formats.append(http_f)
last_stream_inf = {} last_stream_inf = {}
return formats return formats
@ -1977,15 +2023,17 @@ class InfoExtractor(object):
}) })
return entries return entries
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}): def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}, data=None, headers={}, query={}):
res = self._download_xml_handle( res = self._download_xml_handle(
mpd_url, video_id, mpd_url, video_id,
note=note or 'Downloading MPD manifest', note=note or 'Downloading MPD manifest',
errnote=errnote or 'Failed to download MPD manifest', errnote=errnote or 'Failed to download MPD manifest',
fatal=fatal) fatal=fatal, data=data, headers=headers, query=query)
if res is False: if res is False:
return [] return []
mpd_doc, urlh = res mpd_doc, urlh = res
if mpd_doc is None:
return []
mpd_base_url = base_url(urlh.geturl()) mpd_base_url = base_url(urlh.geturl())
return self._parse_mpd_formats( return self._parse_mpd_formats(
@ -2111,7 +2159,6 @@ class InfoExtractor(object):
bandwidth = int_or_none(representation_attrib.get('bandwidth')) bandwidth = int_or_none(representation_attrib.get('bandwidth'))
f = { f = {
'format_id': '%s-%s' % (mpd_id, representation_id) if mpd_id else representation_id, 'format_id': '%s-%s' % (mpd_id, representation_id) if mpd_id else representation_id,
'url': base_url,
'manifest_url': mpd_url, 'manifest_url': mpd_url,
'ext': mimetype2ext(mime_type), 'ext': mimetype2ext(mime_type),
'width': int_or_none(representation_attrib.get('width')), 'width': int_or_none(representation_attrib.get('width')),
@ -2132,7 +2179,7 @@ class InfoExtractor(object):
# First of, % characters outside $...$ templates # First of, % characters outside $...$ templates
# must be escaped by doubling for proper processing # must be escaped by doubling for proper processing
# by % operator string formatting used further (see # by % operator string formatting used further (see
# https://github.com/rg3/youtube-dl/issues/16867). # https://github.com/ytdl-org/youtube-dl/issues/16867).
t = '' t = ''
in_template = False in_template = False
for c in tmpl: for c in tmpl:
@ -2151,7 +2198,7 @@ class InfoExtractor(object):
# @initialization is a regular template like @media one # @initialization is a regular template like @media one
# so it should be handled just the same way (see # so it should be handled just the same way (see
# https://github.com/rg3/youtube-dl/issues/11605) # https://github.com/ytdl-org/youtube-dl/issues/11605)
if 'initialization' in representation_ms_info: if 'initialization' in representation_ms_info:
initialization_template = prepare_template( initialization_template = prepare_template(
'initialization', 'initialization',
@ -2237,7 +2284,7 @@ class InfoExtractor(object):
elif 'segment_urls' in representation_ms_info: elif 'segment_urls' in representation_ms_info:
# Segment URLs with no SegmentTimeline # Segment URLs with no SegmentTimeline
# Example: https://www.seznam.cz/zpravy/clanek/cesko-zasahne-vitr-o-sile-vichrice-muze-byt-i-zivotu-nebezpecny-39091 # Example: https://www.seznam.cz/zpravy/clanek/cesko-zasahne-vitr-o-sile-vichrice-muze-byt-i-zivotu-nebezpecny-39091
# https://github.com/rg3/youtube-dl/pull/14844 # https://github.com/ytdl-org/youtube-dl/pull/14844
fragments = [] fragments = []
segment_duration = float_or_none( segment_duration = float_or_none(
representation_ms_info['segment_duration'], representation_ms_info['segment_duration'],
@ -2250,10 +2297,14 @@ class InfoExtractor(object):
fragment['duration'] = segment_duration fragment['duration'] = segment_duration
fragments.append(fragment) fragments.append(fragment)
representation_ms_info['fragments'] = fragments representation_ms_info['fragments'] = fragments
# NB: MPD manifest may contain direct URLs to unfragmented media. # If there is a fragments key available then we correctly recognized fragmented media.
# No fragments key is present in this case. # Otherwise we will assume unfragmented media with direct access. Technically, such
# assumption is not necessarily correct since we may simply have no support for
# some forms of fragmented media renditions yet, but for now we'll use this fallback.
if 'fragments' in representation_ms_info: if 'fragments' in representation_ms_info:
f.update({ f.update({
# NB: mpd_url may be empty when MPD manifest is parsed from a string
'url': mpd_url or base_url,
'fragment_base_url': base_url, 'fragment_base_url': base_url,
'fragments': [], 'fragments': [],
'protocol': 'http_dash_segments', 'protocol': 'http_dash_segments',
@ -2264,11 +2315,15 @@ class InfoExtractor(object):
f['url'] = initialization_url f['url'] = initialization_url
f['fragments'].append({location_key(initialization_url): initialization_url}) f['fragments'].append({location_key(initialization_url): initialization_url})
f['fragments'].extend(representation_ms_info['fragments']) f['fragments'].extend(representation_ms_info['fragments'])
else:
# Assuming direct URL to unfragmented media.
f['url'] = base_url
# According to [1, 5.3.5.2, Table 7, page 35] @id of Representation # According to [1, 5.3.5.2, Table 7, page 35] @id of Representation
# is not necessarily unique within a Period thus formats with # is not necessarily unique within a Period thus formats with
# the same `format_id` are quite possible. There are numerous examples # the same `format_id` are quite possible. There are numerous examples
# of such manifests (see https://github.com/rg3/youtube-dl/issues/15111, # of such manifests (see https://github.com/ytdl-org/youtube-dl/issues/15111,
# https://github.com/rg3/youtube-dl/issues/13919) # https://github.com/ytdl-org/youtube-dl/issues/13919)
full_info = formats_dict.get(representation_id, {}).copy() full_info = formats_dict.get(representation_id, {}).copy()
full_info.update(f) full_info.update(f)
formats.append(full_info) formats.append(full_info)
@ -2276,12 +2331,12 @@ class InfoExtractor(object):
self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type) self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
return formats return formats
def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True): def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True, data=None, headers={}, query={}):
res = self._download_xml_handle( res = self._download_xml_handle(
ism_url, video_id, ism_url, video_id,
note=note or 'Downloading ISM manifest', note=note or 'Downloading ISM manifest',
errnote=errnote or 'Failed to download ISM manifest', errnote=errnote or 'Failed to download ISM manifest',
fatal=fatal) fatal=fatal, data=data, headers=headers, query=query)
if res is False: if res is False:
return [] return []
ism_doc, urlh = res ism_doc, urlh = res
@ -2429,7 +2484,7 @@ class InfoExtractor(object):
media_tags.extend(re.findall( media_tags.extend(re.findall(
# We only allow video|audio followed by a whitespace or '>'. # We only allow video|audio followed by a whitespace or '>'.
# Allowing more characters may end up in significant slow down (see # Allowing more characters may end up in significant slow down (see
# https://github.com/rg3/youtube-dl/issues/11979, example URL: # https://github.com/ytdl-org/youtube-dl/issues/11979, example URL:
# http://www.porntrex.com/maps/videositemap.xml). # http://www.porntrex.com/maps/videositemap.xml).
r'(?s)(<(?P<tag>(?:amp-)?(?:video|audio))(?:\s+[^>]*)?>)(.*?)</(?P=tag)>', webpage)) r'(?s)(<(?P<tag>(?:amp-)?(?:video|audio))(?:\s+[^>]*)?>)(.*?)</(?P=tag)>', webpage))
for media_tag, media_type, media_content in media_tags: for media_tag, media_type, media_content in media_tags:
@ -2438,25 +2493,50 @@ class InfoExtractor(object):
'subtitles': {}, 'subtitles': {},
} }
media_attributes = extract_attributes(media_tag) media_attributes = extract_attributes(media_tag)
src = media_attributes.get('src') src = strip_or_none(media_attributes.get('src'))
if src: if src:
_, formats = _media_formats(src, media_type) _, formats = _media_formats(src, media_type)
media_info['formats'].extend(formats) media_info['formats'].extend(formats)
media_info['thumbnail'] = absolute_url(media_attributes.get('poster')) media_info['thumbnail'] = absolute_url(media_attributes.get('poster'))
if media_content: if media_content:
for source_tag in re.findall(r'<source[^>]+>', media_content): for source_tag in re.findall(r'<source[^>]+>', media_content):
source_attributes = extract_attributes(source_tag) s_attr = extract_attributes(source_tag)
src = source_attributes.get('src') # data-video-src and data-src are non standard but seen
# several times in the wild
src = strip_or_none(dict_get(s_attr, ('src', 'data-video-src', 'data-src')))
if not src: if not src:
continue continue
f = parse_content_type(source_attributes.get('type')) f = parse_content_type(s_attr.get('type'))
is_plain_url, formats = _media_formats(src, media_type, f) is_plain_url, formats = _media_formats(src, media_type, f)
if is_plain_url: if is_plain_url:
# res attribute is not standard but seen several times # width, height, res, label and title attributes are
# in the wild # all not standard but seen several times in the wild
labels = [
s_attr.get(lbl)
for lbl in ('label', 'title')
if str_or_none(s_attr.get(lbl))
]
width = int_or_none(s_attr.get('width'))
height = (int_or_none(s_attr.get('height'))
or int_or_none(s_attr.get('res')))
if not width or not height:
for lbl in labels:
resolution = parse_resolution(lbl)
if not resolution:
continue
width = width or resolution.get('width')
height = height or resolution.get('height')
for lbl in labels:
tbr = parse_bitrate(lbl)
if tbr:
break
else:
tbr = None
f.update({ f.update({
'height': int_or_none(source_attributes.get('res')), 'width': width,
'format_id': source_attributes.get('label'), 'height': height,
'tbr': tbr,
'format_id': s_attr.get('label') or s_attr.get('title'),
}) })
f.update(formats[0]) f.update(formats[0])
media_info['formats'].append(f) media_info['formats'].append(f)
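Review note: the label handling above leans on Python's `for ... else`, which is easy to misread. The sketch below mirrors the control flow only. `parse_resolution` and `parse_bitrate` here are deliberately simplified stand-ins written for this note, not the real `utils` helpers, and `describe_source` is a hypothetical name.

```python
import re


def parse_resolution(s):
    # Simplified stand-in: understands "1280x720" and "720p" only.
    m = re.search(r'\b(\d+)\s*[xX]\s*(\d+)\b', s)
    if m:
        return {'width': int(m.group(1)), 'height': int(m.group(2))}
    m = re.search(r'\b(\d+)[pPiI]\b', s)
    if m:
        return {'height': int(m.group(1))}
    return {}


def parse_bitrate(s):
    # Simplified stand-in: "1500kbps" -> 1500.
    m = re.search(r'\b(\d+)\s*kbps', s)
    return int(m.group(1)) if m else None


def describe_source(attrs):
    labels = [attrs[k] for k in ('label', 'title') if attrs.get(k)]
    width = int(attrs['width']) if attrs.get('width') else None
    height = int(attrs.get('height') or attrs.get('res') or 0) or None
    if not width or not height:
        for lbl in labels:
            resolution = parse_resolution(lbl)
            if not resolution:
                continue
            # Explicit attributes win; labels only fill in what is missing.
            width = width or resolution.get('width')
            height = height or resolution.get('height')
    for lbl in labels:
        tbr = parse_bitrate(lbl)
        if tbr:
            break
    else:
        # Runs only when the loop ends without break, including when labels is empty.
        tbr = None
    return {'width': width, 'height': height, 'tbr': tbr}


print(describe_source({'label': '720p', 'title': '1280x720 1500kbps'}))
print(describe_source({'res': '480'}))
```

The `else` branch matters for the empty-label case: without it `tbr` would be unbound when a `<source>` tag has neither `label` nor `title`.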
@ -2466,7 +2546,7 @@ class InfoExtractor(object):
track_attributes = extract_attributes(track_tag) track_attributes = extract_attributes(track_tag)
kind = track_attributes.get('kind') kind = track_attributes.get('kind')
if not kind or kind in ('subtitles', 'captions'): if not kind or kind in ('subtitles', 'captions'):
src = track_attributes.get('src') src = strip_or_none(track_attributes.get('src'))
if not src: if not src:
continue continue
lang = track_attributes.get('srclang') or track_attributes.get('lang') or track_attributes.get('label') lang = track_attributes.get('srclang') or track_attributes.get('lang') or track_attributes.get('label')
@ -2623,8 +2703,8 @@ class InfoExtractor(object):
entry = { entry = {
'id': this_video_id, 'id': this_video_id,
'title': unescapeHTML(video_data['title'] if require_title else video_data.get('title')), 'title': unescapeHTML(video_data['title'] if require_title else video_data.get('title')),
'description': video_data.get('description'), 'description': clean_html(video_data.get('description')),
'thumbnail': self._proto_relative_url(video_data.get('image')), 'thumbnail': urljoin(base_url, self._proto_relative_url(video_data.get('image'))),
'timestamp': int_or_none(video_data.get('pubdate')), 'timestamp': int_or_none(video_data.get('pubdate')),
'duration': float_or_none(jwplayer_data.get('duration') or video_data.get('duration')), 'duration': float_or_none(jwplayer_data.get('duration') or video_data.get('duration')),
'subtitles': subtitles, 'subtitles': subtitles,
@ -2651,12 +2731,9 @@ class InfoExtractor(object):
for source in jwplayer_sources_data: for source in jwplayer_sources_data:
if not isinstance(source, dict): if not isinstance(source, dict):
continue continue
source_url = self._proto_relative_url(source.get('file')) source_url = urljoin(
if not source_url: base_url, self._proto_relative_url(source.get('file')))
continue if not source_url or source_url in urls:
if base_url:
source_url = compat_urlparse.urljoin(base_url, source_url)
if source_url in urls:
continue continue
urls.append(source_url) urls.append(source_url)
source_type = source.get('type') or '' source_type = source.get('type') or ''
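Review note: the collapsed check relies on the URL-joining helper returning a falsy value when there is nothing to join. A rough Python 3 sketch of that idea follows. `join_or_none` is a hypothetical simplification built on the standard library's `urllib.parse.urljoin`, not the project's own `urljoin` utility, and the `_proto_relative_url` step is left out.

```python
from urllib.parse import urljoin


def join_or_none(base, path):
    # Hypothetical simplification: no usable path means no URL.
    if not isinstance(path, str) or not path:
        return None
    return urljoin(base or '', path)


def unique_source_urls(base_url, sources):
    urls = []
    for source in sources:
        if not isinstance(source, dict):
            continue
        source_url = join_or_none(base_url, source.get('file'))
        # One check covers both "missing" and "already seen".
        if not source_url or source_url in urls:
            continue
        urls.append(source_url)
    return urls


result = unique_source_urls('https://example.com/player/', [
    {'file': 'a.mp4'},
    {'file': 'https://example.com/player/a.mp4'},  # duplicate once resolved
    {'file': None},
    'not-a-dict',
    {'file': '/b.mp4'},
])
print(result)
```

Resolving before de-duplicating means a relative and an absolute spelling of the same file are counted once.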
@ -2753,6 +2830,33 @@ class InfoExtractor(object):
self._downloader.cookiejar.add_cookie_header(req) self._downloader.cookiejar.add_cookie_header(req)
return compat_cookies.SimpleCookie(req.get_header('Cookie')) return compat_cookies.SimpleCookie(req.get_header('Cookie'))
def _apply_first_set_cookie_header(self, url_handle, cookie):
"""
Apply the first Set-Cookie header instead of the last. Experimental.
Some sites (e.g. [1-3]) may serve two cookies under the same name
in the Set-Cookie header and expect the first (old) one to be set rather
than the second (new) one. However, per RFC 6265 the newer cookie
should be stored, which is what actually happens.
We work around this by manually resetting the cookie to
the first one.
1. https://new.vk.com/
2. https://github.com/ytdl-org/youtube-dl/issues/9841#issuecomment-227871201
3. https://learning.oreilly.com/
"""
for header, cookies in url_handle.headers.items():
if header.lower() != 'set-cookie':
continue
if sys.version_info[0] >= 3:
cookies = cookies.encode('iso-8859-1')
cookies = cookies.decode('utf-8')
cookie_value = re.search(
r'%s=(.+?);.*?\b[Dd]omain=(.+?)(?:[,;]|$)' % cookie, cookies)
if cookie_value:
value, domain = cookie_value.groups()
self._set_cookie(domain, cookie, value)
break
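Review note: the regex in `_apply_first_set_cookie_header` can be exercised on its own. The sketch below copies the pattern and applies it to an invented combined header value. The function name `first_cookie`, the cookie name and the domain are assumptions for illustration. As in the hunk, the cookie name is interpolated without `re.escape`.

```python
import re


def first_cookie(set_cookie_header, name):
    # Non-greedy groups pick the first "name=value" and the first Domain after it.
    m = re.search(
        r'%s=(.+?);.*?\b[Dd]omain=(.+?)(?:[,;]|$)' % name, set_cookie_header)
    return m.groups() if m else None


# Two cookies with the same name folded into one header value (invented sample).
header = ('remixlang=0; path=/; domain=.example.com, '
          'remixlang=3; path=/; domain=.example.com')
print(first_cookie(header, 'remixlang'))
```

Because `re.search` returns the leftmost match and both groups are non-greedy, the older value `0` is selected rather than `3`.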
def get_testcases(self, include_onlymatching=False): def get_testcases(self, include_onlymatching=False):
t = getattr(self, '_TEST', None) t = getattr(self, '_TEST', None)
if t: if t:
@ -2783,8 +2887,8 @@ class InfoExtractor(object):
return not any_restricted return not any_restricted
def extract_subtitles(self, *args, **kwargs): def extract_subtitles(self, *args, **kwargs):
if (self._downloader.params.get('writesubtitles', False) or if (self._downloader.params.get('writesubtitles', False)
self._downloader.params.get('listsubtitles')): or self._downloader.params.get('listsubtitles')):
return self._get_subtitles(*args, **kwargs) return self._get_subtitles(*args, **kwargs)
return {} return {}
@ -2809,8 +2913,8 @@ class InfoExtractor(object):
return ret return ret
def extract_automatic_captions(self, *args, **kwargs): def extract_automatic_captions(self, *args, **kwargs):
if (self._downloader.params.get('writeautomaticsub', False) or if (self._downloader.params.get('writeautomaticsub', False)
self._downloader.params.get('listsubtitles')): or self._downloader.params.get('listsubtitles')):
return self._get_automatic_captions(*args, **kwargs) return self._get_automatic_captions(*args, **kwargs)
return {} return {}
@ -2818,9 +2922,9 @@ class InfoExtractor(object):
raise NotImplementedError('This method must be implemented by subclasses') raise NotImplementedError('This method must be implemented by subclasses')
def mark_watched(self, *args, **kwargs): def mark_watched(self, *args, **kwargs):
if (self._downloader.params.get('mark_watched', False) and if (self._downloader.params.get('mark_watched', False)
(self._get_login_info()[0] is not None or and (self._get_login_info()[0] is not None
self._downloader.params.get('cookiefile') is not None)): or self._downloader.params.get('cookiefile') is not None)):
self._mark_watched(*args, **kwargs) self._mark_watched(*args, **kwargs)
def _mark_watched(self, *args, **kwargs): def _mark_watched(self, *args, **kwargs):

View File

@ -36,7 +36,7 @@ class UnicodeBOMIE(InfoExtractor):
_VALID_URL = r'(?P<bom>\ufeff)(?P<id>.*)$' _VALID_URL = r'(?P<bom>\ufeff)(?P<id>.*)$'
# Disable test for python 3.2 since BOM is broken in re in this version # Disable test for python 3.2 since BOM is broken in re in this version
# (see https://github.com/rg3/youtube-dl/issues/9751) # (see https://github.com/ytdl-org/youtube-dl/issues/9751)
_TESTS = [] if (3, 0) < sys.version_info <= (3, 3) else [{ _TESTS = [] if (3, 0) < sys.version_info <= (3, 3) else [{
'url': '\ufeffhttp://www.youtube.com/watch?v=BaW_jenozKc', 'url': '\ufeffhttp://www.youtube.com/watch?v=BaW_jenozKc',
'only_matching': True, 'only_matching': True,

Some files were not shown because too many files have changed in this diff