Merge remote-tracking branch 'origin/master' into pr/4585

pukkandan 2022-10-19 12:47:38 +05:30
commit 1b9c25cc42
GPG Key ID: 7EEE9E1E817D0A39
204 changed files with 11625 additions and 4420 deletions

@ -2,6 +2,13 @@ name: Broken site
description: Report broken or misfunctioning site description: Report broken or misfunctioning site
labels: [triage, site-bug] labels: [triage, site-bug]
body: body:
- type: checkboxes
attributes:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
required: true
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:
@ -11,11 +18,11 @@ body:
options: options:
- label: I'm reporting a broken site - label: I'm reporting a broken site
required: true required: true
- label: I've verified that I'm running yt-dlp version **2022.07.18** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit) - label: I've verified that I'm running yt-dlp version **2022.10.04** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true required: true
- label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/ytdl-org/youtube-dl#video-url-contains-an-ampersand-and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command) - label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
required: true required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true required: true
@ -55,7 +62,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube'] [debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i'] [debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.07.18 [9d339c4] (win32_exe) [debug] yt-dlp version 2022.10.04 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs [debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs [debug] Checking exe version: ffprobe -bsfs
@ -63,8 +70,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {} [debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.07.18, Current version: 2022.07.18 Latest version: 2022.10.04, Current version: 2022.10.04
yt-dlp is up to date (2022.07.18) yt-dlp is up to date (2022.10.04)
<more lines> <more lines>
render: shell render: shell
validations: validations:

@ -2,6 +2,13 @@ name: Site support request
description: Request support for a new site description: Request support for a new site
labels: [triage, site-request] labels: [triage, site-request]
body: body:
- type: checkboxes
attributes:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
required: true
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:
@ -11,11 +18,11 @@ body:
options: options:
- label: I'm reporting a new site support request - label: I'm reporting a new site support request
required: true required: true
- label: I've verified that I'm running yt-dlp version **2022.07.18** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit) - label: I've verified that I'm running yt-dlp version **2022.10.04** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true required: true
- label: I've checked that none of provided URLs [violate any copyrights](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free) or contain any [DRM](https://en.wikipedia.org/wiki/Digital_rights_management) to the best of my knowledge - label: I've checked that none of provided URLs [violate any copyrights](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy) or contain any [DRM](https://en.wikipedia.org/wiki/Digital_rights_management) to the best of my knowledge
required: true required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true required: true
@ -67,7 +74,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube'] [debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i'] [debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.07.18 [9d339c4] (win32_exe) [debug] yt-dlp version 2022.10.04 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs [debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs [debug] Checking exe version: ffprobe -bsfs
@ -75,8 +82,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {} [debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.07.18, Current version: 2022.07.18 Latest version: 2022.10.04, Current version: 2022.10.04
yt-dlp is up to date (2022.07.18) yt-dlp is up to date (2022.10.04)
<more lines> <more lines>
render: shell render: shell
validations: validations:

@ -2,6 +2,13 @@ name: Site feature request
description: Request a new functionality for a supported site description: Request a new functionality for a supported site
labels: [triage, site-enhancement] labels: [triage, site-enhancement]
body: body:
- type: checkboxes
attributes:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
required: true
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:
@ -11,7 +18,7 @@ body:
options: options:
- label: I'm requesting a site-specific feature - label: I'm requesting a site-specific feature
required: true required: true
- label: I've verified that I'm running yt-dlp version **2022.07.18** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit) - label: I've verified that I'm running yt-dlp version **2022.10.04** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true required: true
@ -63,7 +70,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube'] [debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i'] [debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.07.18 [9d339c4] (win32_exe) [debug] yt-dlp version 2022.10.04 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs [debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs [debug] Checking exe version: ffprobe -bsfs
@ -71,8 +78,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {} [debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.07.18, Current version: 2022.07.18 Latest version: 2022.10.04, Current version: 2022.10.04
yt-dlp is up to date (2022.07.18) yt-dlp is up to date (2022.10.04)
<more lines> <more lines>
render: shell render: shell
validations: validations:

@ -2,6 +2,13 @@ name: Bug report
description: Report a bug unrelated to any particular site or extractor description: Report a bug unrelated to any particular site or extractor
labels: [triage, bug] labels: [triage, bug]
body: body:
- type: checkboxes
attributes:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
required: true
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:
@ -11,11 +18,11 @@ body:
options: options:
- label: I'm reporting a bug unrelated to a specific site - label: I'm reporting a bug unrelated to a specific site
required: true required: true
- label: I've verified that I'm running yt-dlp version **2022.07.18** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit) - label: I've verified that I'm running yt-dlp version **2022.10.04** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true required: true
- label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/ytdl-org/youtube-dl#video-url-contains-an-ampersand-and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command) - label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
required: true required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true required: true
@ -48,7 +55,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube'] [debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i'] [debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.07.18 [9d339c4] (win32_exe) [debug] yt-dlp version 2022.10.04 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs [debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs [debug] Checking exe version: ffprobe -bsfs
@ -56,8 +63,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {} [debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.07.18, Current version: 2022.07.18 Latest version: 2022.10.04, Current version: 2022.10.04
yt-dlp is up to date (2022.07.18) yt-dlp is up to date (2022.10.04)
<more lines> <more lines>
render: shell render: shell
validations: validations:

@ -2,6 +2,13 @@ name: Feature request
description: Request a new functionality unrelated to any particular site or extractor description: Request a new functionality unrelated to any particular site or extractor
labels: [triage, enhancement] labels: [triage, enhancement]
body: body:
- type: checkboxes
attributes:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
required: true
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:
@ -13,7 +20,7 @@ body:
required: true required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme) - label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true required: true
- label: I've verified that I'm running yt-dlp version **2022.07.18** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit) - label: I've verified that I'm running yt-dlp version **2022.10.04** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true required: true
@ -44,7 +51,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube'] [debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i'] [debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.07.18 [9d339c4] (win32_exe) [debug] yt-dlp version 2022.10.04 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs [debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs [debug] Checking exe version: ffprobe -bsfs
@ -52,7 +59,7 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {} [debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.07.18, Current version: 2022.07.18 Latest version: 2022.10.04, Current version: 2022.10.04
yt-dlp is up to date (2022.07.18) yt-dlp is up to date (2022.10.04)
<more lines> <more lines>
render: shell render: shell

@ -2,12 +2,19 @@ name: Ask question
description: Ask yt-dlp related question description: Ask yt-dlp related question
labels: [question] labels: [question]
body: body:
- type: checkboxes
attributes:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
required: true
- type: markdown - type: markdown
attributes: attributes:
value: | value: |
### Make sure you are **only** asking a question and not reporting a bug or requesting a feature. ### Make sure you are **only** asking a question and not reporting a bug or requesting a feature.
If your question contains "isn't working" or "can you add", this is most likely the wrong template. If your question contains "isn't working" or "can you add", this is most likely the wrong template.
If you are in doubt whether this is the right template, **use another template**! If you are in doubt whether this is the right template, **USE ANOTHER TEMPLATE**!
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:
@ -19,7 +26,7 @@ body:
required: true required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme) - label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true required: true
- label: I've verified that I'm running yt-dlp version **2022.07.18** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit) - label: I've verified that I'm running yt-dlp version **2022.10.04** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
required: true required: true
@ -50,7 +57,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube'] [debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i'] [debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.07.18 [9d339c4] (win32_exe) [debug] yt-dlp version 2022.10.04 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs [debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs [debug] Checking exe version: ffprobe -bsfs
@ -58,7 +65,7 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {} [debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.07.18, Current version: 2022.07.18 Latest version: 2022.10.04, Current version: 2022.10.04
yt-dlp is up to date (2022.07.18) yt-dlp is up to date (2022.10.04)
<more lines> <more lines>
render: shell render: shell

@ -2,6 +2,7 @@ name: Broken site
description: Report broken or misfunctioning site description: Report broken or misfunctioning site
labels: [triage, site-bug] labels: [triage, site-bug]
body: body:
%(no_skip)s
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:
@ -15,7 +16,7 @@ body:
required: true required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true required: true
- label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/ytdl-org/youtube-dl#video-url-contains-an-ampersand-and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command) - label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
required: true required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true required: true

@ -2,6 +2,7 @@ name: Site support request
description: Request support for a new site description: Request support for a new site
labels: [triage, site-request] labels: [triage, site-request]
body: body:
%(no_skip)s
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:
@ -15,7 +16,7 @@ body:
required: true required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true required: true
- label: I've checked that none of provided URLs [violate any copyrights](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free) or contain any [DRM](https://en.wikipedia.org/wiki/Digital_rights_management) to the best of my knowledge - label: I've checked that none of provided URLs [violate any copyrights](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy) or contain any [DRM](https://en.wikipedia.org/wiki/Digital_rights_management) to the best of my knowledge
required: true required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true required: true

@ -2,6 +2,7 @@ name: Site feature request
description: Request a new functionality for a supported site description: Request a new functionality for a supported site
labels: [triage, site-enhancement] labels: [triage, site-enhancement]
body: body:
%(no_skip)s
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:

@ -2,6 +2,7 @@ name: Bug report
description: Report a bug unrelated to any particular site or extractor description: Report a bug unrelated to any particular site or extractor
labels: [triage, bug] labels: [triage, bug]
body: body:
%(no_skip)s
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:
@ -15,7 +16,7 @@ body:
required: true required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true required: true
- label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/ytdl-org/youtube-dl#video-url-contains-an-ampersand-and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command) - label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
required: true required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true required: true

@ -2,6 +2,7 @@ name: Feature request
description: Request a new functionality unrelated to any particular site or extractor description: Request a new functionality unrelated to any particular site or extractor
labels: [triage, enhancement] labels: [triage, enhancement]
body: body:
%(no_skip)s
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:

@ -2,12 +2,13 @@ name: Ask question
description: Ask yt-dlp related question description: Ask yt-dlp related question
labels: [question] labels: [question]
body: body:
%(no_skip)s
- type: markdown - type: markdown
attributes: attributes:
value: | value: |
### Make sure you are **only** asking a question and not reporting a bug or requesting a feature. ### Make sure you are **only** asking a question and not reporting a bug or requesting a feature.
If your question contains "isn't working" or "can you add", this is most likely the wrong template. If your question contains "isn't working" or "can you add", this is most likely the wrong template.
If you are in doubt whether this is the right template, **use another template**! If you are in doubt whether this is the right template, **USE ANOTHER TEMPLATE**!
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:
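
Note on `%(no_skip)s`: the template sources above gain a named placeholder where the generated issue forms gain the "DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE" checkbox block. A minimal sketch of how such a placeholder could be expanded with Python's `%`-style mapping formatting follows. The names `NO_SKIP`, `TEMPLATE`, and `render_template` are hypothetical, and the YAML indentation is assumed (it is not visible in this view); this is not presented as the repository's actual generator script.

```python
# Hypothetical sketch: names and indentation are assumptions for illustration.
# The block is stripped so its first line inherits the placeholder's indentation.
NO_SKIP = r'''
  - type: checkboxes
    attributes:
      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
      description: Fill all fields even if you think it is irrelevant for the issue
      options:
        - label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
          required: true
'''.strip()

# A shortened template source containing the named placeholder.
TEMPLATE = '''name: Broken site
description: Report broken or misfunctioning site
labels: [triage, site-bug]
body:
  %(no_skip)s
  - type: checkboxes
    id: checklist
'''


def render_template(template, fields):
    """Fill %(name)s placeholders; a literal percent sign in the template must be written as %%."""
    return template % fields


rendered = render_template(TEMPLATE, {'no_skip': NO_SKIP})
print(rendered)
```

Keeping the shared block in one place means a wording change only has to be made once and then regenerated into every form, which matches the identical block appearing in all six generated templates above.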

@ -1,3 +1,5 @@
**IMPORTANT**: PRs without the template will be CLOSED
### Description of your *pull request* and other information ### Description of your *pull request* and other information
</details> </details>

@ -2,18 +2,17 @@ name: Build
on: workflow_dispatch on: workflow_dispatch
jobs: jobs:
create_release: prepare:
runs-on: ubuntu-latest runs-on: ubuntu-latest
outputs: outputs:
version_suffix: ${{ steps.version_suffix.outputs.version_suffix }} version_suffix: ${{ steps.version_suffix.outputs.version_suffix }}
ytdlp_version: ${{ steps.bump_version.outputs.ytdlp_version }} ytdlp_version: ${{ steps.bump_version.outputs.ytdlp_version }}
upload_url: ${{ steps.create_release.outputs.upload_url }} head_sha: ${{ steps.push_release.outputs.head_sha }}
release_id: ${{ steps.create_release.outputs.id }}
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v3
with: with:
fetch-depth: 0 fetch-depth: 0
- uses: actions/setup-python@v2 - uses: actions/setup-python@v4
with: with:
python-version: '3.10' python-version: '3.10'
@ -43,53 +42,15 @@ jobs:
PUSH_VERSION_COMMIT: ${{ secrets.PUSH_VERSION_COMMIT }} PUSH_VERSION_COMMIT: ${{ secrets.PUSH_VERSION_COMMIT }}
if: "env.PUSH_VERSION_COMMIT != ''" if: "env.PUSH_VERSION_COMMIT != ''"
run: git push origin ${{ github.event.ref }} run: git push origin ${{ github.event.ref }}
- name: Get Changelog
run: |
changelog=$(grep -oPz '(?s)(?<=### ${{ steps.bump_version.outputs.ytdlp_version }}\n{2}).+?(?=\n{2,3}###)' Changelog.md) || true
echo "changelog<<EOF" >> $GITHUB_ENV
echo "$changelog" >> $GITHUB_ENV
echo "EOF" >> $GITHUB_ENV
- name: Create Release
id: create_release
uses: actions/create-release@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
tag_name: ${{ steps.bump_version.outputs.ytdlp_version }}
release_name: yt-dlp ${{ steps.bump_version.outputs.ytdlp_version }}
commitish: ${{ steps.push_release.outputs.head_sha }}
draft: true
prerelease: false
body: |
#### [A description of the various files]((https://github.com/yt-dlp/yt-dlp#release-files)) are in the README
---
<details open><summary><h3>Changelog</summary>
<p>
${{ env.changelog }}
</p>
</details>
build_unix: build_unix:
needs: create_release needs: prepare
runs-on: ubuntu-18.04 # Standalone executable should be built on minimum supported OS runs-on: ubuntu-18.04 # Standalone executable should be built on minimum supported OS
outputs:
sha256_bin: ${{ steps.get_sha.outputs.sha256_bin }}
sha512_bin: ${{ steps.get_sha.outputs.sha512_bin }}
sha256_tar: ${{ steps.get_sha.outputs.sha256_tar }}
sha512_tar: ${{ steps.get_sha.outputs.sha512_tar }}
sha256_linux: ${{ steps.get_sha.outputs.sha256_linux }}
sha512_linux: ${{ steps.get_sha.outputs.sha512_linux }}
sha256_linux_zip: ${{ steps.get_sha.outputs.sha256_linux_zip }}
sha512_linux_zip: ${{ steps.get_sha.outputs.sha512_linux_zip }}
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v3
- uses: actions/setup-python@v2 - uses: actions/setup-python@v4
with: with:
python-version: '3.10' python-version: '3.10'
- name: Install Requirements - name: Install Requirements
@ -100,7 +61,7 @@ jobs:
- name: Prepare - name: Prepare
run: | run: |
python devscripts/update-version.py ${{ needs.create_release.outputs.version_suffix }} python devscripts/update-version.py ${{ needs.prepare.outputs.version_suffix }}
python devscripts/make_lazy_extractors.py python devscripts/make_lazy_extractors.py
- name: Build Unix executables - name: Build Unix executables
run: | run: |
@ -111,51 +72,15 @@ jobs:
- name: Get SHA2-SUMS - name: Get SHA2-SUMS
id: get_sha id: get_sha
run: | run: |
echo "::set-output name=sha256_bin::$(sha256sum yt-dlp | awk '{print $1}')"
echo "::set-output name=sha512_bin::$(sha512sum yt-dlp | awk '{print $1}')"
echo "::set-output name=sha256_tar::$(sha256sum yt-dlp.tar.gz | awk '{print $1}')"
echo "::set-output name=sha512_tar::$(sha512sum yt-dlp.tar.gz | awk '{print $1}')"
echo "::set-output name=sha256_linux::$(sha256sum dist/yt-dlp_linux | awk '{print $1}')"
echo "::set-output name=sha512_linux::$(sha512sum dist/yt-dlp_linux | awk '{print $1}')"
echo "::set-output name=sha256_linux_zip::$(sha256sum dist/yt-dlp_linux.zip | awk '{print $1}')"
echo "::set-output name=sha512_linux_zip::$(sha512sum dist/yt-dlp_linux.zip | awk '{print $1}')"
- name: Upload zip binary - name: Upload artifacts
uses: actions/upload-release-asset@v1 uses: actions/upload-artifact@v3
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with: with:
upload_url: ${{ needs.create_release.outputs.upload_url }} path: |
asset_path: ./yt-dlp yt-dlp
asset_name: yt-dlp yt-dlp.tar.gz
asset_content_type: application/octet-stream dist/yt-dlp_linux
- name: Upload Source tar dist/yt-dlp_linux.zip
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
asset_path: ./yt-dlp.tar.gz
asset_name: yt-dlp.tar.gz
asset_content_type: application/gzip
- name: Upload standalone binary
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
asset_path: ./dist/yt-dlp_linux
asset_name: yt-dlp_linux
asset_content_type: application/octet-stream
- name: Upload onedir binary
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
asset_path: ./dist/yt-dlp_linux.zip
asset_name: yt-dlp_linux.zip
asset_content_type: application/zip
- name: Build and publish on PyPi - name: Build and publish on PyPi
env: env:
@ -164,6 +89,7 @@ jobs:
if: "env.TWINE_PASSWORD != ''" if: "env.TWINE_PASSWORD != ''"
run: | run: |
rm -rf dist/* rm -rf dist/*
python devscripts/set-variant.py pip -M "You installed yt-dlp with pip or using the wheel from PyPi; Use that to update"
python setup.py sdist bdist_wheel python setup.py sdist bdist_wheel
twine upload dist/* twine upload dist/*
@ -180,24 +106,19 @@ jobs:
if: "env.BREW_TOKEN != ''" if: "env.BREW_TOKEN != ''"
run: | run: |
git clone git@github.com:yt-dlp/homebrew-taps taps/ git clone git@github.com:yt-dlp/homebrew-taps taps/
python devscripts/update-formulae.py taps/Formula/yt-dlp.rb "${{ needs.create_release.outputs.ytdlp_version }}" python devscripts/update-formulae.py taps/Formula/yt-dlp.rb "${{ needs.prepare.outputs.ytdlp_version }}"
git -C taps/ config user.name github-actions git -C taps/ config user.name github-actions
git -C taps/ config user.email github-actions@example.com git -C taps/ config user.email github-actions@example.com
git -C taps/ commit -am 'yt-dlp: ${{ needs.create_release.outputs.ytdlp_version }}' git -C taps/ commit -am 'yt-dlp: ${{ needs.prepare.outputs.ytdlp_version }}'
git -C taps/ push git -C taps/ push
build_macos: build_macos:
runs-on: macos-11 runs-on: macos-11
needs: create_release needs: prepare
outputs:
sha256_macos: ${{ steps.get_sha.outputs.sha256_macos }}
sha512_macos: ${{ steps.get_sha.outputs.sha512_macos }}
sha256_macos_zip: ${{ steps.get_sha.outputs.sha256_macos_zip }}
sha512_macos_zip: ${{ steps.get_sha.outputs.sha512_macos_zip }}
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v3
# NB: In order to create a universal2 application, the version of python3 in /usr/bin has to be used # NB: In order to create a universal2 application, the version of python3 in /usr/bin has to be used
- name: Install Requirements - name: Install Requirements
run: | run: |
@ -206,50 +127,28 @@ jobs:
- name: Prepare - name: Prepare
run: | run: |
/usr/bin/python3 devscripts/update-version.py ${{ needs.create_release.outputs.version_suffix }} /usr/bin/python3 devscripts/update-version.py ${{ needs.prepare.outputs.version_suffix }}
/usr/bin/python3 devscripts/make_lazy_extractors.py /usr/bin/python3 devscripts/make_lazy_extractors.py
- name: Build - name: Build
run: | run: |
/usr/bin/python3 pyinst.py --target-architecture universal2 --onedir /usr/bin/python3 pyinst.py --target-architecture universal2 --onedir
(cd ./dist/yt-dlp_macos && zip -r ../yt-dlp_macos.zip .) (cd ./dist/yt-dlp_macos && zip -r ../yt-dlp_macos.zip .)
/usr/bin/python3 pyinst.py --target-architecture universal2 /usr/bin/python3 pyinst.py --target-architecture universal2
- name: Get SHA2-SUMS
id: get_sha
run: |
echo "::set-output name=sha256_macos::$(sha256sum dist/yt-dlp_macos | awk '{print $1}')"
echo "::set-output name=sha512_macos::$(sha512sum dist/yt-dlp_macos | awk '{print $1}')"
echo "::set-output name=sha256_macos_zip::$(sha256sum dist/yt-dlp_macos.zip | awk '{print $1}')"
echo "::set-output name=sha512_macos_zip::$(sha512sum dist/yt-dlp_macos.zip | awk '{print $1}')"
- name: Upload standalone binary - name: Upload artifacts
uses: actions/upload-release-asset@v1 uses: actions/upload-artifact@v3
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with: with:
upload_url: ${{ needs.create_release.outputs.upload_url }} path: |
asset_path: ./dist/yt-dlp_macos dist/yt-dlp_macos
asset_name: yt-dlp_macos dist/yt-dlp_macos.zip
asset_content_type: application/octet-stream
- name: Upload onedir binary
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
asset_path: ./dist/yt-dlp_macos.zip
asset_name: yt-dlp_macos.zip
asset_content_type: application/zip
build_macos_legacy: build_macos_legacy:
runs-on: macos-latest runs-on: macos-latest
needs: create_release needs: prepare
outputs:
sha256_macos_legacy: ${{ steps.get_sha.outputs.sha256_macos_legacy }}
sha512_macos_legacy: ${{ steps.get_sha.outputs.sha512_macos_legacy }}
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v3
- name: Install Python - name: Install Python
# We need the official Python, because the GA ones only support newer macOS versions # We need the official Python, because the GA ones only support newer macOS versions
env: env:
@ -269,52 +168,37 @@ jobs:
- name: Prepare - name: Prepare
run: | run: |
python3 devscripts/update-version.py ${{ needs.create_release.outputs.version_suffix }} python3 devscripts/update-version.py ${{ needs.prepare.outputs.version_suffix }}
python3 devscripts/make_lazy_extractors.py python3 devscripts/make_lazy_extractors.py
- name: Build - name: Build
run: | run: |
python3 pyinst.py python3 pyinst.py
- name: Get SHA2-SUMS mv dist/yt-dlp_macos dist/yt-dlp_macos_legacy
id: get_sha
run: |
echo "::set-output name=sha256_macos_legacy::$(sha256sum dist/yt-dlp_macos | awk '{print $1}')"
echo "::set-output name=sha512_macos_legacy::$(sha512sum dist/yt-dlp_macos | awk '{print $1}')"
- name: Upload standalone binary - name: Upload artifacts
uses: actions/upload-release-asset@v1 uses: actions/upload-artifact@v3
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with: with:
upload_url: ${{ needs.create_release.outputs.upload_url }} path: |
asset_path: ./dist/yt-dlp_macos dist/yt-dlp_macos_legacy
asset_name: yt-dlp_macos_legacy
asset_content_type: application/octet-stream
build_windows: build_windows:
runs-on: windows-latest runs-on: windows-latest
needs: create_release needs: prepare
outputs:
sha256_win: ${{ steps.get_sha.outputs.sha256_win }}
sha512_win: ${{ steps.get_sha.outputs.sha512_win }}
sha256_py2exe: ${{ steps.get_sha.outputs.sha256_py2exe }}
sha512_py2exe: ${{ steps.get_sha.outputs.sha512_py2exe }}
sha256_win_zip: ${{ steps.get_sha.outputs.sha256_win_zip }}
sha512_win_zip: ${{ steps.get_sha.outputs.sha512_win_zip }}
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v3
- uses: actions/setup-python@v2 - uses: actions/setup-python@v4
with: # 3.8 is used for Win7 support with: # 3.8 is used for Win7 support
python-version: '3.8' python-version: '3.8'
- name: Install Requirements - name: Install Requirements
run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
python -m pip install --upgrade pip setuptools wheel py2exe python -m pip install --upgrade pip setuptools wheel "py2exe<0.12"
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-5.2-py3-none-any.whl" -r requirements.txt pip install "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-5.3-py3-none-any.whl" -r requirements.txt
- name: Prepare - name: Prepare
run: | run: |
python devscripts/update-version.py ${{ needs.create_release.outputs.version_suffix }} python devscripts/update-version.py ${{ needs.prepare.outputs.version_suffix }}
python devscripts/make_lazy_extractors.py python devscripts/make_lazy_extractors.py
- name: Build - name: Build
run: | run: |
@ -323,154 +207,118 @@ jobs:
python pyinst.py python pyinst.py
python pyinst.py --onedir python pyinst.py --onedir
Compress-Archive -Path ./dist/yt-dlp/* -DestinationPath ./dist/yt-dlp_win.zip Compress-Archive -Path ./dist/yt-dlp/* -DestinationPath ./dist/yt-dlp_win.zip
- name: Get SHA2-SUMS
id: get_sha
run: |
echo "::set-output name=sha256_py2exe::$((Get-FileHash dist\yt-dlp_min.exe -Algorithm SHA256).Hash.ToLower())"
echo "::set-output name=sha512_py2exe::$((Get-FileHash dist\yt-dlp_min.exe -Algorithm SHA512).Hash.ToLower())"
echo "::set-output name=sha256_win::$((Get-FileHash dist\yt-dlp.exe -Algorithm SHA256).Hash.ToLower())"
echo "::set-output name=sha512_win::$((Get-FileHash dist\yt-dlp.exe -Algorithm SHA512).Hash.ToLower())"
echo "::set-output name=sha256_win_zip::$((Get-FileHash dist\yt-dlp_win.zip -Algorithm SHA256).Hash.ToLower())"
echo "::set-output name=sha512_win_zip::$((Get-FileHash dist\yt-dlp_win.zip -Algorithm SHA512).Hash.ToLower())"
- name: Upload py2exe binary - name: Upload artifacts
uses: actions/upload-release-asset@v1 uses: actions/upload-artifact@v3
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with: with:
upload_url: ${{ needs.create_release.outputs.upload_url }} path: |
asset_path: ./dist/yt-dlp_min.exe dist/yt-dlp.exe
asset_name: yt-dlp_min.exe dist/yt-dlp_min.exe
asset_content_type: application/vnd.microsoft.portable-executable dist/yt-dlp_win.zip
- name: Upload standalone binary
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
asset_path: ./dist/yt-dlp.exe
asset_name: yt-dlp.exe
asset_content_type: application/vnd.microsoft.portable-executable
- name: Upload onedir binary
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
asset_path: ./dist/yt-dlp_win.zip
asset_name: yt-dlp_win.zip
asset_content_type: application/zip
build_windows32: build_windows32:
runs-on: windows-latest runs-on: windows-latest
needs: create_release needs: prepare
outputs:
sha256_win32: ${{ steps.get_sha.outputs.sha256_win32 }}
sha512_win32: ${{ steps.get_sha.outputs.sha512_win32 }}
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v3
- uses: actions/setup-python@v2 - uses: actions/setup-python@v4
with: # 3.7 is used for Vista support. See https://github.com/yt-dlp/yt-dlp/issues/390 with: # 3.7 is used for Vista support. See https://github.com/yt-dlp/yt-dlp/issues/390
python-version: '3.7' python-version: '3.7'
architecture: 'x86' architecture: 'x86'
- name: Install Requirements - name: Install Requirements
run: | run: |
python -m pip install --upgrade pip setuptools wheel python -m pip install --upgrade pip setuptools wheel
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-5.2-py3-none-any.whl" -r requirements.txt pip install "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-5.3-py3-none-any.whl" -r requirements.txt
- name: Prepare - name: Prepare
run: | run: |
python devscripts/update-version.py ${{ needs.create_release.outputs.version_suffix }} python devscripts/update-version.py ${{ needs.prepare.outputs.version_suffix }}
python devscripts/make_lazy_extractors.py python devscripts/make_lazy_extractors.py
- name: Build - name: Build
run: | run: |
python pyinst.py python pyinst.py
- name: Get SHA2-SUMS
id: get_sha
run: |
echo "::set-output name=sha256_win32::$((Get-FileHash dist\yt-dlp_x86.exe -Algorithm SHA256).Hash.ToLower())"
echo "::set-output name=sha512_win32::$((Get-FileHash dist\yt-dlp_x86.exe -Algorithm SHA512).Hash.ToLower())"
- name: Upload standalone binary - name: Upload artifacts
uses: actions/upload-release-asset@v1 uses: actions/upload-artifact@v3
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with: with:
upload_url: ${{ needs.create_release.outputs.upload_url }} path: |
asset_path: ./dist/yt-dlp_x86.exe dist/yt-dlp_x86.exe
asset_name: yt-dlp_x86.exe
asset_content_type: application/vnd.microsoft.portable-executable
finish: publish_release:
runs-on: ubuntu-latest runs-on: ubuntu-latest
needs: [create_release, build_unix, build_windows, build_windows32, build_macos, build_macos_legacy] needs: [prepare, build_unix, build_windows, build_windows32, build_macos, build_macos_legacy]
steps: steps:
- name: Make SHA2-SUMS files - uses: actions/checkout@v3
- uses: actions/download-artifact@v3
- name: Get Changelog
run: | run: |
echo "${{ needs.build_unix.outputs.sha256_bin }} yt-dlp" >> SHA2-256SUMS changelog=$(grep -oPz '(?s)(?<=### ${{ needs.prepare.outputs.ytdlp_version }}\n{2}).+?(?=\n{2,3}###)' Changelog.md) || true
echo "${{ needs.build_unix.outputs.sha256_tar }} yt-dlp.tar.gz" >> SHA2-256SUMS echo "changelog<<EOF" >> $GITHUB_ENV
echo "${{ needs.build_unix.outputs.sha256_linux }} yt-dlp_linux" >> SHA2-256SUMS echo "$changelog" >> $GITHUB_ENV
echo "${{ needs.build_unix.outputs.sha256_linux_zip }} yt-dlp_linux.zip" >> SHA2-256SUMS echo "EOF" >> $GITHUB_ENV
echo "${{ needs.build_windows.outputs.sha256_win }} yt-dlp.exe" >> SHA2-256SUMS
echo "${{ needs.build_windows.outputs.sha256_py2exe }} yt-dlp_min.exe" >> SHA2-256SUMS
echo "${{ needs.build_windows32.outputs.sha256_win32 }} yt-dlp_x86.exe" >> SHA2-256SUMS
echo "${{ needs.build_windows.outputs.sha256_win_zip }} yt-dlp_win.zip" >> SHA2-256SUMS
echo "${{ needs.build_macos.outputs.sha256_macos }} yt-dlp_macos" >> SHA2-256SUMS
echo "${{ needs.build_macos.outputs.sha256_macos_zip }} yt-dlp_macos.zip" >> SHA2-256SUMS
echo "${{ needs.build_macos_legacy.outputs.sha256_macos_legacy }} yt-dlp_macos_legacy" >> SHA2-256SUMS
echo "${{ needs.build_unix.outputs.sha512_bin }} yt-dlp" >> SHA2-512SUMS
echo "${{ needs.build_unix.outputs.sha512_tar }} yt-dlp.tar.gz" >> SHA2-512SUMS
echo "${{ needs.build_unix.outputs.sha512_linux }} yt-dlp_linux" >> SHA2-512SUMS
echo "${{ needs.build_unix.outputs.sha512_linux_zip }} yt-dlp_linux.zip" >> SHA2-512SUMS
echo "${{ needs.build_windows.outputs.sha512_win }} yt-dlp.exe" >> SHA2-512SUMS
echo "${{ needs.build_windows.outputs.sha512_py2exe }} yt-dlp_min.exe" >> SHA2-512SUMS
echo "${{ needs.build_windows32.outputs.sha512_win32 }} yt-dlp_x86.exe" >> SHA2-512SUMS
echo "${{ needs.build_windows.outputs.sha512_win_zip }} yt-dlp_win.zip" >> SHA2-512SUMS
echo "${{ needs.build_macos.outputs.sha512_macos }} yt-dlp_macos" >> SHA2-512SUMS
echo "${{ needs.build_macos.outputs.sha512_macos_zip }} yt-dlp_macos.zip" >> SHA2-512SUMS
echo "${{ needs.build_macos_legacy.outputs.sha512_macos_legacy }} yt-dlp_macos_legacy" >> SHA2-512SUMS
- name: Upload SHA2-256SUMS file
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
asset_path: ./SHA2-256SUMS
asset_name: SHA2-256SUMS
asset_content_type: text/plain
- name: Upload SHA2-512SUMS file
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
asset_path: ./SHA2-512SUMS
asset_name: SHA2-512SUMS
asset_content_type: text/plain
- name: Make Update spec - name: Make Update spec
run: | run: |
echo "# This file is used for regulating self-update" >> _update_spec echo "# This file is used for regulating self-update" >> _update_spec
echo "lock 2022.07.18 .+ Python 3.6" >> _update_spec echo "lock 2022.07.18 .+ Python 3.6" >> _update_spec
- name: Upload update spec - name: Make SHA2-SUMS files
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
asset_path: ./_update_spec
asset_name: _update_spec
asset_content_type: text/plain
- name: Finalize release
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: | run: |
gh api -X PATCH -H "Accept: application/vnd.github.v3+json" \ sha256sum artifact/yt-dlp | awk '{print $1 " yt-dlp"}' >> SHA2-256SUMS
/repos/${{ github.repository }}/releases/${{ needs.create_release.outputs.release_id }} \ sha256sum artifact/yt-dlp.tar.gz | awk '{print $1 " yt-dlp.tar.gz"}' >> SHA2-256SUMS
-F draft=false sha256sum artifact/yt-dlp.exe | awk '{print $1 " yt-dlp.exe"}' >> SHA2-256SUMS
sha256sum artifact/yt-dlp_win.zip | awk '{print $1 " yt-dlp_win.zip"}' >> SHA2-256SUMS
sha256sum artifact/yt-dlp_min.exe | awk '{print $1 " yt-dlp_min.exe"}' >> SHA2-256SUMS
sha256sum artifact/yt-dlp_x86.exe | awk '{print $1 " yt-dlp_x86.exe"}' >> SHA2-256SUMS
sha256sum artifact/yt-dlp_macos | awk '{print $1 " yt-dlp_macos"}' >> SHA2-256SUMS
sha256sum artifact/yt-dlp_macos.zip | awk '{print $1 " yt-dlp_macos.zip"}' >> SHA2-256SUMS
sha256sum artifact/yt-dlp_macos_legacy | awk '{print $1 " yt-dlp_macos_legacy"}' >> SHA2-256SUMS
sha256sum artifact/dist/yt-dlp_linux | awk '{print $1 " yt-dlp_linux"}' >> SHA2-256SUMS
sha256sum artifact/dist/yt-dlp_linux.zip | awk '{print $1 " yt-dlp_linux.zip"}' >> SHA2-256SUMS
sha512sum artifact/yt-dlp | awk '{print $1 " yt-dlp"}' >> SHA2-512SUMS
sha512sum artifact/yt-dlp.tar.gz | awk '{print $1 " yt-dlp.tar.gz"}' >> SHA2-512SUMS
sha512sum artifact/yt-dlp.exe | awk '{print $1 " yt-dlp.exe"}' >> SHA2-512SUMS
sha512sum artifact/yt-dlp_win.zip | awk '{print $1 " yt-dlp_win.zip"}' >> SHA2-512SUMS
sha512sum artifact/yt-dlp_min.exe | awk '{print $1 " yt-dlp_min.exe"}' >> SHA2-512SUMS
sha512sum artifact/yt-dlp_x86.exe | awk '{print $1 " yt-dlp_x86.exe"}' >> SHA2-512SUMS
sha512sum artifact/yt-dlp_macos | awk '{print $1 " yt-dlp_macos"}' >> SHA2-512SUMS
sha512sum artifact/yt-dlp_macos.zip | awk '{print $1 " yt-dlp_macos.zip"}' >> SHA2-512SUMS
sha512sum artifact/yt-dlp_macos_legacy | awk '{print $1 " yt-dlp_macos_legacy"}' >> SHA2-512SUMS
sha512sum artifact/dist/yt-dlp_linux | awk '{print $1 " yt-dlp_linux"}' >> SHA2-512SUMS
sha512sum artifact/dist/yt-dlp_linux.zip | awk '{print $1 " yt-dlp_linux.zip"}' >> SHA2-512SUMS
- name: Publish Release
uses: yt-dlp/action-gh-release@v1
with:
tag_name: ${{ needs.prepare.outputs.ytdlp_version }}
name: yt-dlp ${{ needs.prepare.outputs.ytdlp_version }}
target_commitish: ${{ needs.prepare.outputs.head_sha }}
body: |
#### [A description of the various files](https://github.com/yt-dlp/yt-dlp#release-files) is in the README
---
<details open><summary><h3>Changelog</h3></summary>
<p>
${{ env.changelog }}
</p>
</details>
files: |
SHA2-256SUMS
SHA2-512SUMS
artifact/yt-dlp
artifact/yt-dlp.tar.gz
artifact/yt-dlp.exe
artifact/yt-dlp_win.zip
artifact/yt-dlp_min.exe
artifact/yt-dlp_x86.exe
artifact/yt-dlp_macos
artifact/yt-dlp_macos.zip
artifact/yt-dlp_macos_legacy
artifact/dist/yt-dlp_linux
artifact/dist/yt-dlp_linux.zip
_update_spec
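The "Make SHA2-SUMS files" step above hashes each downloaded artifact and appends one digest-plus-name line to `SHA2-256SUMS` or `SHA2-512SUMS`. As an illustration only, the same idea might look like this in Python. `checksum_line` and `write_sums` are hypothetical names that are not part of the workflow or of yt-dlp, and the two-space separator follows the usual `sha256sum` output format rather than anything this diff confirms.

```python
import hashlib
from pathlib import Path


def checksum_line(path: Path, algorithm: str = "sha256") -> str:
    """Return '<hex digest>  <file name>' for one file (hypothetical helper)."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # Read in chunks so large binaries do not have to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    # Only the base name is recorded, so 'artifact/dist/yt-dlp_linux'
    # is listed as 'yt-dlp_linux', as in the workflow step.
    return f"{digest.hexdigest()}  {path.name}"


def write_sums(paths, output: Path, algorithm: str = "sha256") -> None:
    """Write one checksum line per input file to `output`."""
    lines = [checksum_line(Path(p), algorithm) for p in paths]
    output.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

A call such as `write_sums(["artifact/yt-dlp", "artifact/dist/yt-dlp_linux"], Path("SHA2-256SUMS"))` would produce a file in the same shape as the one the step builds; passing `algorithm="sha512"` gives the `SHA2-512SUMS` counterpart.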


@ -21,9 +21,9 @@ jobs:
python-version: pypy-3.9 python-version: pypy-3.9
run-tests-ext: bat run-tests-ext: bat
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }} - name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2 uses: actions/setup-python@v4
with: with:
python-version: ${{ matrix.python-version }} python-version: ${{ matrix.python-version }}
- name: Install pytest - name: Install pytest


@ -6,9 +6,9 @@ jobs:
if: "contains(github.event.head_commit.message, 'ci run dl')" if: "contains(github.event.head_commit.message, 'ci run dl')"
runs-on: ubuntu-latest runs-on: ubuntu-latest
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v3
- name: Set up Python - name: Set up Python
uses: actions/setup-python@v2 uses: actions/setup-python@v4
with: with:
python-version: 3.9 python-version: 3.9
- name: Install test requirements - name: Install test requirements
@ -36,9 +36,9 @@ jobs:
python-version: pypy-3.9 python-version: pypy-3.9
run-tests-ext: bat run-tests-ext: bat
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }} - name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2 uses: actions/setup-python@v4
with: with:
python-version: ${{ matrix.python-version }} python-version: ${{ matrix.python-version }}
- name: Install pytest - name: Install pytest


@ -6,9 +6,9 @@ jobs:
if: "!contains(github.event.head_commit.message, 'ci skip all')" if: "!contains(github.event.head_commit.message, 'ci skip all')"
runs-on: ubuntu-latest runs-on: ubuntu-latest
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v3
- name: Set up Python - name: Set up Python
uses: actions/setup-python@v2 uses: actions/setup-python@v4
with: with:
python-version: 3.9 python-version: 3.9
- name: Install test requirements - name: Install test requirements
@ -20,9 +20,9 @@ jobs:
if: "!contains(github.event.head_commit.message, 'ci skip all')" if: "!contains(github.event.head_commit.message, 'ci skip all')"
runs-on: ubuntu-latest runs-on: ubuntu-latest
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v3
- name: Set up Python - name: Set up Python
uses: actions/setup-python@v2 uses: actions/setup-python@v4
with: with:
python-version: 3.9 python-version: 3.9
- name: Install flake8 - name: Install flake8

.gitignore vendored

@ -33,13 +33,14 @@ cookies
*.jpeg *.jpeg
*.jpg *.jpg
*.m4a *.m4a
*.mpga
*.m4v *.m4v
*.mhtml *.mhtml
*.mkv *.mkv
*.mov *.mov
*.mp3 *.mp3
*.mp4 *.mp4
*.mpga
*.oga
*.ogg *.ogg
*.opus *.opus
*.png *.png
@ -47,6 +48,7 @@ cookies
*.srt *.srt
*.swf *.swf
*.swp *.swp
*.tt
*.ttml *.ttml
*.url *.url
*.vtt *.vtt
@ -85,6 +87,7 @@ updates_key.pem
.tox .tox
*.class *.class
*.isorted *.isorted
*.stackdump
# Generated # Generated
AUTHORS AUTHORS


@ -161,7 +161,7 @@ ## Adding new feature or making overarching changes
## Adding support for a new site ## Adding support for a new site
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](https://www.github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. yt-dlp does **not support** such sites thus pull requests adding support for them **will be rejected**. If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](#is-the-website-primarily-used-for-piracy)**. yt-dlp does **not support** such sites thus pull requests adding support for them **will be rejected**.
After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`): After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
@ -195,7 +195,7 @@ ## Adding support for a new site
# * A value # * A value
# * MD5 checksum; start the string with md5: # * MD5 checksum; start the string with md5:
# * A regular expression; start the string with re: # * A regular expression; start the string with re:
# * Any Python type (for example int or float) # * Any Python type, e.g. int or float
} }
}] }]
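The comment lines in the hunk above list the forms an expected test value can take: a plain value, a string starting with `md5:`, a string starting with `re:`, or a Python type. A minimal sketch of how such a check could work is shown here. `matches_expected` is a hypothetical helper written for illustration and is not yt-dlp's own test code, which may handle more cases.

```python
import hashlib
import re


def matches_expected(expected, actual) -> bool:
    """Check one extracted value against one expected value (hypothetical)."""
    if isinstance(expected, type):
        # A Python type such as int or float: only the type is checked.
        return isinstance(actual, expected)
    if isinstance(expected, str) and expected.startswith("md5:"):
        # Compare against the MD5 hex digest of the extracted string.
        digest = hashlib.md5(str(actual).encode("utf-8")).hexdigest()
        return digest == expected[len("md5:"):]
    if isinstance(expected, str) and expected.startswith("re:"):
        # Treat the rest of the string as a regular expression.
        pattern = expected[len("re:"):]
        return isinstance(actual, str) and re.match(pattern, actual) is not None
    # Anything else is a plain value and must compare equal.
    return expected == actual
```

With this sketch, `matches_expected(int, 1280)` checks only the type, while `matches_expected("re:^https?://", url)` accepts any string `url` that starts with an HTTP or HTTPS scheme.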
@ -261,7 +261,7 @@ ### Mandatory and optional metafields
For pornographic sites, appropriate `age_limit` must also be returned. For pornographic sites, appropriate `age_limit` must also be returned.
The extractor is allowed to return the info dict without url or formats in some special cases if it allows the user to extract useful information with `--ignore-no-formats-error` - Eg: when the video is a live stream that has not started yet. The extractor is allowed to return the info dict without url or formats in some special cases if it allows the user to extract useful information with `--ignore-no-formats-error` - e.g. when the video is a live stream that has not started yet.
[Any field](yt_dlp/extractor/common.py#219-L426) apart from the aforementioned ones are considered **optional**. That means that extraction should be **tolerant** to situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. [Any field](yt_dlp/extractor/common.py#219-L426) apart from the aforementioned ones are considered **optional**. That means that extraction should be **tolerant** to situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields.


@ -285,3 +285,49 @@ odo2063
pritam20ps05 pritam20ps05
scy scy
sheerluck sheerluck
AxiosDeminence
DjesonPV
eren-kemer
freezboltz
Galiley
haobinliang
Mehavoid
winterbird-code
yashkc2025
aldoridhoni
bashonly
jacobtruman
masta79
palewire
cgrigis
DavidH-2022
dfaker
jackyyf
ohaiibuzzle
SamantazFox
shreyasminocha
tejasa97
xenov
satan1st
0xGodspeed
5736d79
587021c
basrieter
Bobscorn
CNugteren
columndeeply
DoubleCouponDay
Fabi019
GautamMKGarg
Grub4K
itachi-19
jeroenj
josanabr
LiviaMedeiros
nikita-moor
snapdgn
SuperSonicHub1
tannertechnology
Timendum
tobi1805
TokyoBlackHole


@ -11,6 +11,293 @@ # Instructions for creating release
--> -->
### 2022.10.04
* Allow a `set` to be passed as `download_archive` by [pukkandan](https://github.com/pukkandan), [bashonly](https://github.com/bashonly)
* Allow open ranges for time ranges by [Lesmiscore](https://github.com/Lesmiscore)
* Allow plugin extractors to replace the built-in ones
* Don't download entire video when no matching `--download-sections`
* Fix `--config-location -`
* Improve [5736d79](https://github.com/yt-dlp/yt-dlp/pull/5044/commits/5736d79172c47ff84740d5720467370a560febad)
* Fix for when playlists don't have `webpage_url`
* Support environment variables in `--ffmpeg-location`
* Workaround for `libc_ver` not being available on the Windows Store version of Python
* [outtmpl] Curly braces to filter keys by [pukkandan](https://github.com/pukkandan)
* [outtmpl] Make `%s` work in strfformat for all systems
* [jsinterp] Workaround operator associativity issue
* [cookies] Let `_get_mac_keyring_password` fail gracefully
* [cookies] Parse cookies leniently by [Grub4K](https://github.com/Grub4K)
* [phantomjs] Fix bug in [587021c](https://github.com/yt-dlp/yt-dlp/commit/587021cd9f717181b44e881941aca3f8d753758b) by [elyse0](https://github.com/elyse0)
* [downloader/aria2c] Fix filename containing leading whitespace by [std-move](https://github.com/std-move)
* [downloader/ism] Support ec-3 codec by [nixxo](https://github.com/nixxo)
* [extractor] Fix `fatal=False` in `RetryManager`
* [extractor] Improve json-ld extraction
* [extractor] Make `_search_json` able to parse lists
* [extractor] Escape `%` in `representation_id` of m3u8
* [extractor/generic] Pass through referer from json-ld
* [utils] `base_url`: URL paths can contain `&` by [elyse0](https://github.com/elyse0)
* [utils] `js_to_json`: Improve
* [utils] `Popen.run`: Fix default return in binary mode
* [utils] `traverse_obj`: Rewrite, document and add tests by [Grub4K](https://github.com/Grub4K)
* [devscripts] `make_lazy_extractors`: Fix for Docker by [josanabr](https://github.com/josanabr)
* [docs] Misc Improvements
* [cleanup] Misc fixes and cleanup by [pukkandan](https://github.com/pukkandan), [gamer191](https://github.com/gamer191)
* [extractor/24tv.ua] Add extractors by [coletdjnz](https://github.com/coletdjnz)
* [extractor/BerufeTV] Add extractor by [Fabi019](https://github.com/Fabi019)
* [extractor/booyah] Add extractor by [HobbyistDev](https://github.com/HobbyistDev), [elyse0](https://github.com/elyse0)
* [extractor/bundesliga] Add extractor by [Fabi019](https://github.com/Fabi019)
* [extractor/GoPlay] Add extractor by [CNugteren](https://github.com/CNugteren), [basrieter](https://github.com/basrieter), [jeroenj](https://github.com/jeroenj)
* [extractor/iltalehti] Add extractor by [tpikonen](https://github.com/tpikonen)
* [extractor/IsraelNationalNews] Add extractor by [Bobscorn](https://github.com/Bobscorn)
* [extractor/mediaworksnzvod] Add extractor by [coletdjnz](https://github.com/coletdjnz)
* [extractor/MicrosoftEmbed] Add extractor by [DoubleCouponDay](https://github.com/DoubleCouponDay)
* [extractor/nbc] Add NBCStations extractor by [bashonly](https://github.com/bashonly)
* [extractor/onenewsnz] Add extractor by [coletdjnz](https://github.com/coletdjnz)
* [extractor/prankcast] Add extractor by [HobbyistDev](https://github.com/HobbyistDev), [columndeeply](https://github.com/columndeeply)
* [extractor/Smotrim] Add extractor by [Lesmiscore](https://github.com/Lesmiscore), [nikita-moor](https://github.com/nikita-moor)
* [extractor/tencent] Add Iflix extractor by [elyse0](https://github.com/elyse0)
* [extractor/unscripted] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/adobepass] Add MSO AlticeOne (Optimum TV) by [CplPwnies](https://github.com/CplPwnies)
* [extractor/youtube] **Download `post_live` videos from start** by [Lesmiscore](https://github.com/Lesmiscore), [pukkandan](https://github.com/pukkandan)
* [extractor/youtube] Add support for Shorts audio pivot feed by [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* [extractor/youtube] Detect `lazy-load-for-videos` embeds
* [extractor/youtube] Do not warn on duplicate chapters
* [extractor/youtube] Fix video like count extraction by [coletdjnz](https://github.com/coletdjnz)
* [extractor/youtube] Support changing extraction language by [coletdjnz](https://github.com/coletdjnz)
* [extractor/youtube:tab] Improve continuation items extraction
* [extractor/youtube:tab] Support `reporthistory` page
* [extractor/amazonstore] Fix JSON extraction by [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* [extractor/amazonstore] Retry to avoid captcha page by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/animeondemand] Remove extractor by [TokyoBlackHole](https://github.com/TokyoBlackHole)
* [extractor/anvato] Fix extractor and refactor by [bashonly](https://github.com/bashonly)
* [extractor/artetv] Remove duplicate stream urls by [Grub4K](https://github.com/Grub4K)
* [extractor/audioboom] Support direct URLs and refactor by [pukkandan](https://github.com/pukkandan), [tpikonen](https://github.com/tpikonen)
* [extractor/bandcamp] Extract `uploader_url`
* [extractor/bilibili] Add space.bilibili extractors by [lockmatrix](https://github.com/lockmatrix)
* [extractor/BilibiliSpace] Fix extractor and better error message by [lockmatrix](https://github.com/lockmatrix)
* [extractor/BiliIntl] Support uppercase lang in `_VALID_URL` by [coletdjnz](https://github.com/coletdjnz)
* [extractor/BiliIntlSeries] Fix `_VALID_URL`
* [extractor/bongacams] Update `_VALID_URL` by [0xGodspeed](https://github.com/0xGodspeed)
* [extractor/crunchyroll:beta] Improve handling of hardsubs by [Grub4K](https://github.com/Grub4K)
* [extractor/detik] Generalize extractors by [HobbyistDev](https://github.com/HobbyistDev), [coletdjnz](https://github.com/coletdjnz)
* [extractor/dplay:italy] Add default authentication by [Timendum](https://github.com/Timendum)
* [extractor/heise] Fix extractor by [coletdjnz](https://github.com/coletdjnz)
* [extractor/holodex] Fix `_VALID_URL` by [LiviaMedeiros](https://github.com/LiviaMedeiros)
* [extractor/hrfensehen] Fix extractor by [snapdgn](https://github.com/snapdgn)
* [extractor/hungama] Add subtitle by [GautamMKGarg](https://github.com/GautamMKGarg), [pukkandan](https://github.com/pukkandan)
* [extractor/instagram] Extract more metadata by [pritam20ps05](https://github.com/pritam20ps05)
* [extractor/JWPlatform] Fix extractor by [coletdjnz](https://github.com/coletdjnz)
* [extractor/malltv] Fix `video_id` extraction by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/MLBTV] Detect live streams
* [extractor/motorsport] Support native embeds
* [extractor/Mxplayer] Fix extractor by [itachi-19](https://github.com/itachi-19)
* [extractor/nebula] Add nebula.tv by [tannertechnology](https://github.com/tannertechnology)
* [extractor/nfl] Fix extractor by [bashonly](https://github.com/bashonly)
* [extractor/ondemandkorea] Update `jw_config` regex by [julien-hadleyjack](https://github.com/julien-hadleyjack)
* [extractor/paramountplus] Better DRM detection by [bashonly](https://github.com/bashonly)
* [extractor/patreon] Sort formats
* [extractor/rcs] Fix embed extraction by [coletdjnz](https://github.com/coletdjnz)
* [extractor/redgifs] Fix extractor by [jhwgh1968](https://github.com/jhwgh1968)
* [extractor/rutube] Fix `_EMBED_REGEX` by [coletdjnz](https://github.com/coletdjnz)
* [extractor/RUTV] Fix warnings for livestreams by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/soundcloud:search] More metadata in `--flat-playlist` by [SuperSonicHub1](https://github.com/SuperSonicHub1)
* [extractor/telegraaf] Use mobile GraphQL API endpoint by [coletdjnz](https://github.com/coletdjnz)
* [extractor/tennistv] Fix timestamp by [zenerdi0de](https://github.com/zenerdi0de)
* [extractor/tiktok] Fix TikTokIE by [bashonly](https://github.com/bashonly)
* [extractor/triller] Fix auth token by [bashonly](https://github.com/bashonly)
* [extractor/trovo] Fix extractors by [Mehavoid](https://github.com/Mehavoid)
* [extractor/tv2] Support new url format by [tobi1805](https://github.com/tobi1805)
* [extractor/web.archive:youtube] Fix `_YT_INITIAL_PLAYER_RESPONSE_RE`
* [extractor/wistia] Add support for channels by [coletdjnz](https://github.com/coletdjnz)
* [extractor/wistia] Match IDs in embed URLs by [bashonly](https://github.com/bashonly)
* [extractor/wordpress:playlist] Add generic embed extractor by [coletdjnz](https://github.com/coletdjnz)
* [extractor/yandexvideopreview] Update `_VALID_URL` by [Grub4K](https://github.com/Grub4K)
* [extractor/zee5] Fix `_VALID_URL` by [m4tu4g](https://github.com/m4tu4g)
* [extractor/zee5] Generate device ids by [freezboltz](https://github.com/freezboltz)
### 2022.09.01
* Add option `--use-extractors`
* Merge youtube-dl: Up to [commit/ed5c44e](https://github.com/ytdl-org/youtube-dl/commit/ed5c44e7)
* Add yt-dlp version to infojson
* Fix `--break-per-url --max-downloads`
* Fix bug in `--alias`
* [cookies] Support firefox container in `--cookies-from-browser` by [bashonly](https://github.com/bashonly), [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* [downloader/external] Smarter detection of executable
* [extractor/generic] Don't return JW player without formats
* [FormatSort] Fix `aext` for `--prefer-free-formats`
* [jsinterp] Various improvements by [pukkandan](https://github.com/pukkandan), [dirkf](https://github.com/dirkf), [elyse0](https://github.com/elyse0)
* [cache] Mechanism to invalidate old cache
* [utils] Add `deprecation_warning`
* [utils] Add `orderedSet_from_options`
* [utils] `Popen`: Restore `LD_LIBRARY_PATH` when using PyInstaller by [Lesmiscore](https://github.com/Lesmiscore)
* [build] `make tar` should not follow `DESTDIR` by [satan1st](https://github.com/satan1st)
* [build] Update pyinstaller by [shirt-dev](https://github.com/shirt-dev)
* [test] Fix `test_youtube_signature`
* [cleanup] Misc fixes and cleanup by [DavidH-2022](https://github.com/DavidH-2022), [MrRawes](https://github.com/MrRawes), [pukkandan](https://github.com/pukkandan)
* [extractor/epoch] Add extractor by [tejasa97](https://github.com/tejasa97)
* [extractor/eurosport] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/IslamChannel] Add extractors by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/newspicks] Add extractor by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/triller] Add extractor by [bashonly](https://github.com/bashonly)
* [extractor/VQQ] Add extractors by [elyse0](https://github.com/elyse0)
* [extractor/youtube] Improvements to nsig extraction
* [extractor/youtube] Fix bug in format sorting
* [extractor/youtube] Update iOS Innertube clients by [SamantazFox](https://github.com/SamantazFox)
* [extractor/youtube] Use device-specific user agent by [coletdjnz](https://github.com/coletdjnz)
* [extractor/youtube] Add `--compat-option no-youtube-prefer-utc-upload-date` by [coletdjnz](https://github.com/coletdjnz)
* [extractor/arte] Bug fix by [cgrigis](https://github.com/cgrigis)
* [extractor/bilibili] Extract `flac` with premium account by [jackyyf](https://github.com/jackyyf)
* [extractor/BiliBiliSearch] Don't sort by date
* [extractor/BiliBiliSearch] Fix infinite loop
* [extractor/bitchute] Mark errors as expected
* [extractor/crunchyroll:beta] Use anonymous access by [tejing1](https://github.com/tejing1)
* [extractor/huya] Fix stream extraction by [ohaiibuzzle](https://github.com/ohaiibuzzle)
* [extractor/medaltv] Fix extraction by [xenova](https://github.com/xenova)
* [extractor/mediaset] Fix embed extraction
* [extractor/mixcloud] All formats are audio-only
* [extractor/rtbf] Fix jwt extraction by [elyse0](https://github.com/elyse0)
* [extractor/screencastomatic] Support `--video-password` by [shreyasminocha](https://github.com/shreyasminocha)
* [extractor/stripchat] Don't modify input URL by [dfaker](https://github.com/dfaker)
* [extractor/uktv] Improve `_VALID_URL` by [dirkf](https://github.com/dirkf)
* [extractor/vimeo:user] Fix `_VALID_URL`
### 2022.08.19
* Fix bug in `--download-archive`
* [jsinterp] **Fix for new youtube players** and related improvements by [dirkf](https://github.com/dirkf), [pukkandan](https://github.com/pukkandan)
* [phantomjs] Add function to execute JS without a DOM by [MinePlayersPE](https://github.com/MinePlayersPE), [pukkandan](https://github.com/pukkandan)
* [build] Exclude devscripts from installs by [Lesmiscore](https://github.com/Lesmiscore)
* [cleanup] Misc fixes and cleanup
* [extractor/youtube] **Add fallback to phantomjs** for nsig
* [extractor/youtube] Fix error reporting of "Incomplete data"
* [extractor/youtube] Improve format sorting for iOS formats
* [extractor/youtube] Improve signature caching
* [extractor/instagram] Fix extraction by [bashonly](https://github.com/bashonly), [pritam20ps05](https://github.com/pritam20ps05)
* [extractor/rai] Minor fix by [nixxo](https://github.com/nixxo)
* [extractor/rtbf] Fix stream extractor by [elyse0](https://github.com/elyse0)
* [extractor/SovietsCloset] Fix extractor by [ChillingPepper](https://github.com/ChillingPepper)
* [extractor/zattoo] Fix Zattoo resellers by [goggle](https://github.com/goggle)
### 2022.08.14
* Merge youtube-dl: Up to [commit/d231b56](https://github.com/ytdl-org/youtube-dl/commit/d231b56)
* [jsinterp] Handle **new youtube signature functions**
* [jsinterp] Truncate error messages
* [extractor] Fix format sorting of `channels`
* [ffmpeg] Disable avconv unless `--prefer-avconv`
* [ffmpeg] Smarter detection of ffprobe filename
* [embedthumbnail] Detect `libatomicparsley.so`
* [ThumbnailsConvertor] Fix conversion after `fixup_webp`
* [utils] Fix `get_compatible_ext`
* [build] Fix changelog
* [update] Set executable bit-mask by [pukkandan](https://github.com/pukkandan), [Lesmiscore](https://github.com/Lesmiscore)
* [devscripts] Fix import
* [docs] Consistent use of `e.g.` by [Lesmiscore](https://github.com/Lesmiscore)
* [cleanup] Misc fixes and cleanup
* [extractor/moview] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/parler] Add extractor by [palewire](https://github.com/palewire)
* [extractor/patreon] Ignore erroneous media attachments by [coletdjnz](https://github.com/coletdjnz)
* [extractor/truth] Add extractor by [palewire](https://github.com/palewire)
* [extractor/aenetworks] Add formats parameter by [jacobtruman](https://github.com/jacobtruman)
* [extractor/crunchyroll] Improve `_VALID_URL`s
* [extractor/doodstream] Add `wf` domain by [aldoridhoni](https://github.com/aldoridhoni)
* [extractor/facebook] Add reel support by [bashonly](https://github.com/bashonly)
* [extractor/MLB] New extractor by [ischmidt20](https://github.com/ischmidt20)
* [extractor/rai] Misc fixes by [nixxo](https://github.com/nixxo)
* [extractor/toggo] Improve `_VALID_URL` by [masta79](https://github.com/masta79)
* [extractor/tubitv] Extract additional formats by [shirt-dev](https://github.com/shirt-dev)
* [extractor/zattoo] Potential fix for resellers
### 2022.08.08
* **Remove Python 3.6 support**
* Determine merge container better by [pukkandan](https://github.com/pukkandan), [selfisekai](https://github.com/selfisekai)
* Framework for embed detection by [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* Merge youtube-dl: Up to [commit/adb5294](https://github.com/ytdl-org/youtube-dl/commit/adb5294)
* `--compat-option no-live-chat` should disable danmaku
* Fix misleading DRM message
* Import ctypes only when necessary
* Minor bugfixes
* Reject entire playlists faster with `--match-filter`
* Remove filtered entries from `-J`
* Standardize retry mechanism
* Validate `--merge-output-format`
* [downloader] Add average speed to final progress line
* [extractor] Add field `audio_channels`
* [extractor] Support multiple archive ids for one video
* [ffmpeg] Set `ffmpeg_location` in a contextvar
* [FFmpegThumbnailsConvertor] Fix conversion from GIF
* [MetadataParser] Don't set `None` when the field didn't match
* [outtmpl] Smarter replacing of unsupported characters
* [outtmpl] Treat empty values as None in filenames
* [utils] sanitize_open: Allow any IO stream as stdout
* [build, devscripts] Add devscript to set a build variant
* [build] Improve build process by [shirt-dev](https://github.com/shirt-dev)
* [build] Update pyinstaller
* [devscripts] Create `utils` and refactor
* [docs] Clarify `best*`
* [docs] Fix bug report issue template
* [docs] Fix capitalization in references by [christoph-heinrich](https://github.com/christoph-heinrich)
* [cleanup, mhtml] Use imghdr
* [cleanup, utils] Consolidate known media extensions
* [cleanup] Misc fixes and cleanup
* [extractor/angel] Add extractor by [AxiosDeminence](https://github.com/AxiosDeminence)
* [extractor/dplay] Add MotorTrend extractor by [Sipherdrakon](https://github.com/Sipherdrakon)
* [extractor/harpodeon] Add extractor by [eren-kemer](https://github.com/eren-kemer)
* [extractor/holodex] Add extractor by [pukkandan](https://github.com/pukkandan), [sqrtNOT](https://github.com/sqrtNOT)
* [extractor/kompas] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/rai] Add raisudtirol extractor by [nixxo](https://github.com/nixxo)
* [extractor/tempo] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/youtube] **Fixes for third party client detection** by [coletdjnz](https://github.com/coletdjnz)
* [extractor/youtube] Add `live_status=post_live` by [lazypete365](https://github.com/lazypete365)
* [extractor/youtube] Extract more format info
* [extractor/youtube] Parse translated subtitles only when requested
* [extractor/youtube, extractor/twitch] Allow waiting for channels to become live
* [extractor/youtube, webvtt] Extract auto-subs from livestream VODs by [fstirlitz](https://github.com/fstirlitz), [pukkandan](https://github.com/pukkandan)
* [extractor/AbemaTVTitle] Implement paging by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/archiveorg] Improve handling of formats by [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* [extractor/arte] Fix title extraction
* [extractor/arte] **Move to v2 API** by [fstirlitz](https://github.com/fstirlitz), [pukkandan](https://github.com/pukkandan)
* [extractor/bbc] Fix news articles by [ajj8](https://github.com/ajj8)
* [extractor/camtasia] Separate into own extractor by [coletdjnz](https://github.com/coletdjnz)
* [extractor/cloudflarestream] Fix `video_id` padding by [haobinliang](https://github.com/haobinliang)
* [extractor/crunchyroll] Fix conversion of thumbnail from GIF
* [extractor/crunchyroll] Handle missing metadata correctly by [Burve](https://github.com/Burve), [pukkandan](https://github.com/pukkandan)
* [extractor/crunchyroll:beta] Extract timestamp and fix tests by [tejing1](https://github.com/tejing1)
* [extractor/crunchyroll:beta] Use streams API by [tejing1](https://github.com/tejing1)
* [extractor/doodstream] Support more domains by [Galiley](https://github.com/Galiley)
* [extractor/ESPN] Extract duration by [ischmidt20](https://github.com/ischmidt20)
* [extractor/FIFA] Change API endpoint by [Bricio](https://github.com/Bricio), [yashkc2025](https://github.com/yashkc2025)
* [extractor/globo:article] Remove false positives by [Bricio](https://github.com/Bricio)
* [extractor/Go] Extract timestamp by [ischmidt20](https://github.com/ischmidt20)
* [extractor/hidive] Fix cookie login when netrc is also given by [winterbird-code](https://github.com/winterbird-code)
* [extractor/html5] Separate into own extractor by [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* [extractor/ina] Improve extractor by [elyse0](https://github.com/elyse0)
* [extractor/NaverNow] Change endpoint by [ping](https://github.com/ping)
* [extractor/ninegag] Extract uploader by [DjesonPV](https://github.com/DjesonPV)
* [extractor/NovaPlay] Fix extractor by [Bojidarist](https://github.com/Bojidarist)
* [extractor/orf:radio] Rewrite extractors
* [extractor/patreon] Fix and improve extractors by [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* [extractor/rai] Fix RaiNews extraction by [nixxo](https://github.com/nixxo)
* [extractor/redbee] Unify and update extractors by [elyse0](https://github.com/elyse0)
* [extractor/stripchat] Fix `_VALID_URL` by [freezboltz](https://github.com/freezboltz)
* [extractor/tubi] Exclude playlists from playlist entries by [sqrtNOT](https://github.com/sqrtNOT)
* [extractor/tviplayer] Improve `_VALID_URL` by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/twitch] Extract chapters for single chapter VODs by [mpeter50](https://github.com/mpeter50)
* [extractor/vgtv] Support tv.vg.no by [sqrtNOT](https://github.com/sqrtNOT)
* [extractor/vidio] Support embed link by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/vk] Fix extractor by [Mehavoid](https://github.com/Mehavoid)
* [extractor/WASDTV:record] Fix `_VALID_URL`
* [extractor/xfileshare] Add Referer by [Galiley](https://github.com/Galiley)
* [extractor/YahooJapanNews] Fix extractor by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/yandexmusic] Extract higher quality format
* [extractor/zee5] Update Device ID by [m4tu4g](https://github.com/m4tu4g)
### 2022.07.18 ### 2022.07.18
* Allow users to specify encoding in each config files by [Lesmiscore](https://github.com/Lesmiscore) * Allow users to specify encoding in each config files by [Lesmiscore](https://github.com/Lesmiscore)
@ -125,7 +412,7 @@ ### 2022.06.22
* [**Deprecate support for Python 3.6**](https://github.com/yt-dlp/yt-dlp/issues/3764#issuecomment-1154051119) * [**Deprecate support for Python 3.6**](https://github.com/yt-dlp/yt-dlp/issues/3764#issuecomment-1154051119)
* **Add option `--download-sections` to download video partially** * **Add option `--download-sections` to download video partially**
* Chapter regex and time ranges are accepted (Eg: `--download-sections *1:10-2:20`) * Chapter regex and time ranges are accepted, e.g. `--download-sections *1:10-2:20`
* Add option `--alias` * Add option `--alias`
* Add option `--lazy-playlist` to process entries as they are received * Add option `--lazy-playlist` to process entries as they are received
* Add option `--retry-sleep` * Add option `--retry-sleep`
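
The `--download-sections` entry above accepts either a chapter regex or a `*`-prefixed time range. As a rough illustration only (this is not yt-dlp's own parser; `parse_timestamp` and `parse_section` are hypothetical helper names, and treating every non-`*` value as a regex is an assumption), the distinction could be sketched like this:

```python
import re


def parse_timestamp(text):
    """Convert 'SS', 'MM:SS' or 'HH:MM:SS' into a number of seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_section(spec):
    """Classify a section spec as a time range or a chapter regex.

    Hypothetical helper: a leading '*' marks a START-END time range,
    anything else is compiled as a regex to match chapter titles.
    """
    if spec.startswith("*"):
        start, sep, end = spec[1:].partition("-")
        if not (start and sep and end):
            raise ValueError(f"expected *START-END, got {spec!r}")
        return ("time", parse_timestamp(start), parse_timestamp(end))
    return ("chapter", re.compile(spec))


print(parse_section("*1:10-2:20"))
```

The changelog's own example, `*1:10-2:20`, comes out as a range from 70 to 140 seconds under this sketch.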
@ -1289,7 +1576,7 @@ ### 2021.09.25
* Add new option `--netrc-location` * Add new option `--netrc-location`
* [outtmpl] Allow alternate fields using `,` * [outtmpl] Allow alternate fields using `,`
* [outtmpl] Add format type `B` to treat the value as bytes (eg: to limit the filename to a certain number of bytes) * [outtmpl] Add format type `B` to treat the value as bytes, e.g. to limit the filename to a certain number of bytes
* Separate the options `--ignore-errors` and `--no-abort-on-error` * Separate the options `--ignore-errors` and `--no-abort-on-error`
* Basic framework for simultaneous download of multiple formats by [nao20010128nao](https://github.com/nao20010128nao) * Basic framework for simultaneous download of multiple formats by [nao20010128nao](https://github.com/nao20010128nao)
* [17live] Add 17.live extractor by [nao20010128nao](https://github.com/nao20010128nao) * [17live] Add 17.live extractor by [nao20010128nao](https://github.com/nao20010128nao)
@ -1679,7 +1966,7 @@ ### 2021.07.07
* Merge youtube-dl: Upto [commit/a803582](https://github.com/ytdl-org/youtube-dl/commit/a8035827177d6b59aca03bd717acb6a9bdd75ada) * Merge youtube-dl: Upto [commit/a803582](https://github.com/ytdl-org/youtube-dl/commit/a8035827177d6b59aca03bd717acb6a9bdd75ada)
* Add `--extractor-args` to pass some extractor-specific arguments. See [readme](https://github.com/yt-dlp/yt-dlp#extractor-arguments) * Add `--extractor-args` to pass some extractor-specific arguments. See [readme](https://github.com/yt-dlp/yt-dlp#extractor-arguments)
* Add extractor option `skip` for `youtube`. Eg: `--extractor-args youtube:skip=hls,dash` * Add extractor option `skip` for `youtube`, e.g. `--extractor-args youtube:skip=hls,dash`
* Deprecates `--youtube-skip-dash-manifest`, `--youtube-skip-hls-manifest`, `--youtube-include-dash-manifest`, `--youtube-include-hls-manifest` * Deprecates `--youtube-skip-dash-manifest`, `--youtube-skip-hls-manifest`, `--youtube-include-dash-manifest`, `--youtube-include-hls-manifest`
* Allow `--list...` options to work with `--print`, `--quiet` and other `--list...` options * Allow `--list...` options to work with `--print`, `--quiet` and other `--list...` options
* [youtube] Use `player` API for additional video extraction requests by [coletdjnz](https://github.com/coletdjnz) * [youtube] Use `player` API for additional video extraction requests by [coletdjnz](https://github.com/coletdjnz)


@ -28,12 +28,12 @@ ## [coletdjnz](https://github.com/coletdjnz)
[![gh-sponsor](https://img.shields.io/badge/_-Sponsor-red.svg?logo=githubsponsors&labelColor=555555&style=for-the-badge)](https://github.com/sponsors/coletdjnz) [![gh-sponsor](https://img.shields.io/badge/_-Sponsor-red.svg?logo=githubsponsors&labelColor=555555&style=for-the-badge)](https://github.com/sponsors/coletdjnz)
* YouTube improvements including: age-gate bypass, private playlists, multiple-clients (to avoid throttling) and a lot of under-the-hood improvements * YouTube improvements including: age-gate bypass, private playlists, multiple-clients (to avoid throttling) and a lot of under-the-hood improvements
* Added support for downloading YoutubeWebArchive videos * Added support for new websites YoutubeWebArchive, MainStreaming, PRX, nzherald, Mediaklikk, StarTV etc
* Added support for new websites MainStreaming, PRX, nzherald, etc * Improved/fixed support for Patreon, panopto, gfycat, itv, pbs, SouthParkDE etc
## [Ashish0804](https://github.com/Ashish0804) ## [Ashish0804](https://github.com/Ashish0804) <sub><sup>[Inactive]</sup></sub>
[![ko-fi](https://img.shields.io/badge/_-Ko--fi-red.svg?logo=kofi&labelColor=555555&style=for-the-badge)](https://ko-fi.com/ashish0804) [![ko-fi](https://img.shields.io/badge/_-Ko--fi-red.svg?logo=kofi&labelColor=555555&style=for-the-badge)](https://ko-fi.com/ashish0804)
@ -48,4 +48,5 @@ ## [Lesmiscore](https://github.com/Lesmiscore) (nao20010128nao)
**Monacoin**: mona1q3tf7dzvshrhfe3md379xtvt2n22duhglv5dskr **Monacoin**: mona1q3tf7dzvshrhfe3md379xtvt2n22duhglv5dskr
* Download live from start to end for YouTube * Download live from start to end for YouTube
* Added support for new websites mildom, PixivSketch, skeb, radiko, voicy, mirrativ, openrec, whowatch, damtomo, 17.live, mixch etc * Added support for new websites AbemaTV, mildom, PixivSketch, skeb, radiko, voicy, mirrativ, openrec, whowatch, damtomo, 17.live, mixch etc
* Improved/fixed support for fc2, YahooJapanNews, tver, iwara etc


@ -17,8 +17,8 @@ pypi-files: AUTHORS Changelog.md LICENSE README.md README.txt supportedsites \
clean-test: clean-test:
rm -rf test/testdata/sigs/player-*.js tmp/ *.annotations.xml *.aria2 *.description *.dump *.frag \ rm -rf test/testdata/sigs/player-*.js tmp/ *.annotations.xml *.aria2 *.description *.dump *.frag \
*.frag.aria2 *.frag.urls *.info.json *.live_chat.json *.meta *.part* *.tmp *.temp *.unknown_video *.ytdl \ *.frag.aria2 *.frag.urls *.info.json *.live_chat.json *.meta *.part* *.tmp *.temp *.unknown_video *.ytdl \
*.3gp *.ape *.ass *.avi *.desktop *.f4v *.flac *.flv *.jpeg *.jpg *.m4a *.mpga *.m4v *.mhtml *.mkv *.mov \ *.3gp *.ape *.ass *.avi *.desktop *.f4v *.flac *.flv *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 *.mp4 \
*.mp3 *.mp4 *.ogg *.opus *.png *.sbv *.srt *.swf *.swp *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp *.mpga *.oga *.ogg *.opus *.png *.sbv *.srt *.swf *.swp *.tt *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp
clean-dist: clean-dist:
rm -rf yt-dlp.1.temp.md yt-dlp.1 README.txt MANIFEST build/ dist/ .coverage cover/ yt-dlp.tar.gz completions/ \ rm -rf yt-dlp.1.temp.md yt-dlp.1 README.txt MANIFEST build/ dist/ .coverage cover/ yt-dlp.tar.gz completions/ \
yt_dlp/extractor/lazy_extractors.py *.spec CONTRIBUTING.md.tmp yt-dlp yt-dlp.exe yt_dlp.egg-info/ AUTHORS .mailmap yt_dlp/extractor/lazy_extractors.py *.spec CONTRIBUTING.md.tmp yt-dlp yt-dlp.exe yt_dlp.egg-info/ AUTHORS .mailmap
@ -33,7 +33,6 @@ completion-zsh: completions/zsh/_yt-dlp
lazy-extractors: yt_dlp/extractor/lazy_extractors.py lazy-extractors: yt_dlp/extractor/lazy_extractors.py
PREFIX ?= /usr/local PREFIX ?= /usr/local
DESTDIR ?= .
BINDIR ?= $(PREFIX)/bin BINDIR ?= $(PREFIX)/bin
MANDIR ?= $(PREFIX)/man MANDIR ?= $(PREFIX)/man
SHAREDIR ?= $(PREFIX)/share SHAREDIR ?= $(PREFIX)/share
@ -75,17 +74,16 @@ offlinetest: codetest
$(PYTHON) -m pytest -k "not download" $(PYTHON) -m pytest -k "not download"
# XXX: This is hard to maintain # XXX: This is hard to maintain
CODE_FOLDERS = yt_dlp yt_dlp/downloader yt_dlp/extractor yt_dlp/postprocessor yt_dlp/compat \ CODE_FOLDERS = yt_dlp yt_dlp/downloader yt_dlp/extractor yt_dlp/postprocessor yt_dlp/compat
yt_dlp/extractor/anvato_token_generator
yt-dlp: yt_dlp/*.py yt_dlp/*/*.py yt-dlp: yt_dlp/*.py yt_dlp/*/*.py
mkdir -p zip mkdir -p zip
for d in $(CODE_FOLDERS) ; do \ for d in $(CODE_FOLDERS) ; do \
mkdir -p zip/$$d ;\ mkdir -p zip/$$d ;\
cp -pPR $$d/*.py zip/$$d/ ;\ cp -pPR $$d/*.py zip/$$d/ ;\
done done
touch -t 200001010101 zip/yt_dlp/*.py zip/yt_dlp/*/*.py zip/yt_dlp/*/*/*.py touch -t 200001010101 zip/yt_dlp/*.py zip/yt_dlp/*/*.py
mv zip/yt_dlp/__main__.py zip/ mv zip/yt_dlp/__main__.py zip/
cd zip ; zip -q ../yt-dlp yt_dlp/*.py yt_dlp/*/*.py yt_dlp/*/*/*.py __main__.py cd zip ; zip -q ../yt-dlp yt_dlp/*.py yt_dlp/*/*.py __main__.py
rm -rf zip rm -rf zip
echo '#!$(PYTHON)' > yt-dlp echo '#!$(PYTHON)' > yt-dlp
cat yt-dlp.zip >> yt-dlp cat yt-dlp.zip >> yt-dlp
@ -134,7 +132,7 @@ yt_dlp/extractor/lazy_extractors.py: devscripts/make_lazy_extractors.py devscrip
$(PYTHON) devscripts/make_lazy_extractors.py $@ $(PYTHON) devscripts/make_lazy_extractors.py $@
yt-dlp.tar.gz: all yt-dlp.tar.gz: all
@tar -czf $(DESTDIR)/yt-dlp.tar.gz --transform "s|^|yt-dlp/|" --owner 0 --group 0 \ @tar -czf yt-dlp.tar.gz --transform "s|^|yt-dlp/|" --owner 0 --group 0 \
--exclude '*.DS_Store' \ --exclude '*.DS_Store' \
--exclude '*.kate-swp' \ --exclude '*.kate-swp' \
--exclude '*.pyc' \ --exclude '*.pyc' \

README.md

@ -3,7 +3,7 @@
[![YT-DLP](https://raw.githubusercontent.com/yt-dlp/yt-dlp/master/.github/banner.svg)](#readme) [![YT-DLP](https://raw.githubusercontent.com/yt-dlp/yt-dlp/master/.github/banner.svg)](#readme)
[![Release version](https://img.shields.io/github/v/release/yt-dlp/yt-dlp?color=brightgreen&label=Download&style=for-the-badge)](#release-files "Release") [![Release version](https://img.shields.io/github/v/release/yt-dlp/yt-dlp?color=brightgreen&label=Download&style=for-the-badge)](#installation "Installation")
[![PyPi](https://img.shields.io/badge/-PyPi-blue.svg?logo=pypi&labelColor=555555&style=for-the-badge)](https://pypi.org/project/yt-dlp "PyPi") [![PyPi](https://img.shields.io/badge/-PyPi-blue.svg?logo=pypi&labelColor=555555&style=for-the-badge)](https://pypi.org/project/yt-dlp "PyPi")
[![Donate](https://img.shields.io/badge/_-Donate-red.svg?logo=githubsponsors&labelColor=555555&style=for-the-badge)](Collaborators.md#collaborators "Donate") [![Donate](https://img.shields.io/badge/_-Donate-red.svg?logo=githubsponsors&labelColor=555555&style=for-the-badge)](Collaborators.md#collaborators "Donate")
[![Matrix](https://img.shields.io/matrix/yt-dlp:matrix.org?color=brightgreen&labelColor=555555&label=&logo=element&style=for-the-badge)](https://matrix.to/#/#yt-dlp:matrix.org "Matrix") [![Matrix](https://img.shields.io/matrix/yt-dlp:matrix.org?color=brightgreen&labelColor=555555&label=&logo=element&style=for-the-badge)](https://matrix.to/#/#yt-dlp:matrix.org "Matrix")
@ -25,6 +25,7 @@
* [NEW FEATURES](#new-features) * [NEW FEATURES](#new-features)
* [Differences in default behavior](#differences-in-default-behavior) * [Differences in default behavior](#differences-in-default-behavior)
* [INSTALLATION](#installation) * [INSTALLATION](#installation)
* [Detailed instructions](https://github.com/yt-dlp/yt-dlp/wiki/Installation)
* [Update](#update) * [Update](#update)
* [Release Files](#release-files) * [Release Files](#release-files)
* [Dependencies](#dependencies) * [Dependencies](#dependencies)
@ -47,9 +48,10 @@
* [SponsorBlock Options](#sponsorblock-options) * [SponsorBlock Options](#sponsorblock-options)
* [Extractor Options](#extractor-options) * [Extractor Options](#extractor-options)
* [CONFIGURATION](#configuration) * [CONFIGURATION](#configuration)
* [Configuration file encoding](#configuration-file-encoding)
* [Authentication with .netrc file](#authentication-with-netrc-file) * [Authentication with .netrc file](#authentication-with-netrc-file)
* [Notes about environment variables](#notes-about-environment-variables)
* [OUTPUT TEMPLATE](#output-template) * [OUTPUT TEMPLATE](#output-template)
* [Output template and Windows batch files](#output-template-and-windows-batch-files)
* [Output template examples](#output-template-examples) * [Output template examples](#output-template-examples)
* [FORMAT SELECTION](#format-selection) * [FORMAT SELECTION](#format-selection)
* [Filtering Formats](#filtering-formats) * [Filtering Formats](#filtering-formats)
@ -65,15 +67,16 @@
* [CONTRIBUTING](CONTRIBUTING.md#contributing-to-yt-dlp) * [CONTRIBUTING](CONTRIBUTING.md#contributing-to-yt-dlp)
* [Opening an Issue](CONTRIBUTING.md#opening-an-issue) * [Opening an Issue](CONTRIBUTING.md#opening-an-issue)
* [Developer Instructions](CONTRIBUTING.md#developer-instructions) * [Developer Instructions](CONTRIBUTING.md#developer-instructions)
* [MORE](#more) * [WIKI](https://github.com/yt-dlp/yt-dlp/wiki)
* [FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ)
<!-- MANPAGE: END EXCLUDED SECTION --> <!-- MANPAGE: END EXCLUDED SECTION -->
# NEW FEATURES # NEW FEATURES
* Merged with **youtube-dl v2021.12.17+ [commit/a03b977](https://github.com/ytdl-org/youtube-dl/commit/a03b9775d544b06a5b4f2aa630214c7c22fc2229)**<!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl) * Merged with **youtube-dl v2021.12.17+ [commit/ed5c44e](https://github.com/ytdl-org/youtube-dl/commit/ed5c44e7b74ac77f87ca5ed6cb5e964a0c6a0678)**<!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* **[SponsorBlock Integration](#sponsorblock-options)**: You can mark/remove sponsor sections in youtube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API * **[SponsorBlock Integration](#sponsorblock-options)**: You can mark/remove sponsor sections in YouTube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API
* **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection than what is possible by simply using `--format` ([examples](#format-selection-examples)) * **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection than what is possible by simply using `--format` ([examples](#format-selection-examples))
@ -87,7 +90,7 @@ # NEW FEATURES
* `255kbps` audio is extracted (if available) from YouTube Music when premium cookies are given * `255kbps` audio is extracted (if available) from YouTube Music when premium cookies are given
* Redirect channel's home URL automatically to `/video` to preserve the old behaviour * Redirect channel's home URL automatically to `/video` to preserve the old behaviour
* **Cookies from browser**: Cookies can be automatically extracted from all major web browsers using `--cookies-from-browser BROWSER[+KEYRING][:PROFILE]` * **Cookies from browser**: Cookies can be automatically extracted from all major web browsers using `--cookies-from-browser BROWSER[+KEYRING][:PROFILE][::CONTAINER]`
* **Download time range**: Videos can be downloaded partially based on either timestamps or chapters using `--download-sections` * **Download time range**: Videos can be downloaded partially based on either timestamps or chapters using `--download-sections`
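
The new `--cookies-from-browser BROWSER[+KEYRING][:PROFILE][::CONTAINER]` syntax in this hunk packs up to four fields into one argument. A minimal sketch of how such a value could be split follows, assuming a single `:` introduces the profile and `::` introduces the container. The regex and the `parse_browser_spec` name are illustrative, not taken from yt-dlp:

```python
import re

# Hypothetical pattern for BROWSER[+KEYRING][:PROFILE][::CONTAINER].
_BROWSER_SPEC = re.compile(
    r"""
    (?P<browser>[^+:]+)              # browser name, up to the first '+' or ':'
    (?:\+(?P<keyring>[^:]+))?        # optional '+KEYRING'
    (?::(?!:)(?P<profile>.+?))?      # optional ':PROFILE' (a lone colon only)
    (?:::(?P<container>.+))?         # optional '::CONTAINER'
    """,
    re.VERBOSE,
)


def parse_browser_spec(spec):
    """Split a browser spec into its fields; absent fields are None."""
    match = _BROWSER_SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"invalid browser spec: {spec!r}")
    return match.groupdict()


print(parse_browser_spec("firefox:default::work"))
```

The negative lookahead `(?!:)` keeps `firefox::work` from being read as an empty profile, so the container can be given without a profile.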
@ -139,14 +142,15 @@ ### Differences in default behavior
* `playlist_index` behaves differently when used with options like `--playlist-reverse` and `--playlist-items`. See [#302](https://github.com/yt-dlp/yt-dlp/issues/302) for details. You can use `--compat-options playlist-index` if you want to keep the earlier behavior * `playlist_index` behaves differently when used with options like `--playlist-reverse` and `--playlist-items`. See [#302](https://github.com/yt-dlp/yt-dlp/issues/302) for details. You can use `--compat-options playlist-index` if you want to keep the earlier behavior
* The output of `-F` is listed in a new format. Use `--compat-options list-formats` to revert this * The output of `-F` is listed in a new format. Use `--compat-options list-formats` to revert this
* Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading * Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading
* Youtube channel URLs are automatically redirected to `/video`. Append a `/featured` to the URL to download only the videos in the home page. If the channel does not have a videos tab, we try to download the equivalent `UU` playlist instead. For all other tabs, if the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections * YouTube channel URLs are automatically redirected to `/video`. Append a `/featured` to the URL to download only the videos in the home page. If the channel does not have a videos tab, we try to download the equivalent `UU` playlist instead. For all other tabs, if the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections
* Unavailable videos are also listed for youtube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this * Unavailable videos are also listed for YouTube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this
* The upload dates extracted from YouTube are in UTC [when available](https://github.com/yt-dlp/yt-dlp/blob/89e4d86171c7b7c997c77d4714542e0383bf0db0/yt_dlp/extractor/youtube.py#L3898-L3900). Use `--compat-options no-youtube-prefer-utc-upload-date` to prefer the non-UTC upload date.
* If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this * If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this
* Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead * Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead
* Some private fields such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this * Some private fields such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this
* When `--embed-subs` and `--write-subs` are used together, the subtitles are written to disk and also embedded in the media file. You can use just `--embed-subs` to embed the subs and automatically delete the separate file. See [#630 (comment)](https://github.com/yt-dlp/yt-dlp/issues/630#issuecomment-893659460) for more info. `--compat-options no-keep-subs` can be used to revert this * When `--embed-subs` and `--write-subs` are used together, the subtitles are written to disk and also embedded in the media file. You can use just `--embed-subs` to embed the subs and automatically delete the separate file. See [#630 (comment)](https://github.com/yt-dlp/yt-dlp/issues/630#issuecomment-893659460) for more info. `--compat-options no-keep-subs` can be used to revert this
* `certifi` will be used for SSL root certificates, if installed. If you want to use system certificates (e.g. self-signed), use `--compat-options no-certifi` * `certifi` will be used for SSL root certificates, if installed. If you want to use system certificates (e.g. self-signed), use `--compat-options no-certifi`
* youtube-dl tries to remove some superfluous punctuations from filenames. While this can sometimes be helpful, it is often undesirable. So yt-dlp tries to keep the fields in the filenames as close to their original values as possible. You can use `--compat-options filename-sanitization` to revert to youtube-dl's behavior * yt-dlp's sanitization of invalid characters in filenames is different/smarter than in youtube-dl. You can use `--compat-options filename-sanitization` to revert to youtube-dl's behavior
For ease of use, a few more compat options are available: For ease of use, a few more compat options are available:
@ -157,76 +161,26 @@ ### Differences in default behavior
# INSTALLATION # INSTALLATION
You can install yt-dlp using one of the following methods:
### Using the release binary
You can simply download the [correct binary file](#release-files) for your OS
<!-- MANPAGE: BEGIN EXCLUDED SECTION --> <!-- MANPAGE: BEGIN EXCLUDED SECTION -->
[![Windows](https://img.shields.io/badge/-Windows_x64-blue.svg?style=for-the-badge&logo=windows)](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.exe) [![Windows](https://img.shields.io/badge/-Windows_x64-blue.svg?style=for-the-badge&logo=windows)](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.exe)
[![Linux](https://img.shields.io/badge/-Linux/BSD-red.svg?style=for-the-badge&logo=linux)](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp) [![Unix](https://img.shields.io/badge/-Linux/BSD-red.svg?style=for-the-badge&logo=linux)](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp)
[![MacOS](https://img.shields.io/badge/-MacOS-lightblue.svg?style=for-the-badge&logo=apple)](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos) [![MacOS](https://img.shields.io/badge/-MacOS-lightblue.svg?style=for-the-badge&logo=apple)](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos)
[![PyPi](https://img.shields.io/badge/-PyPi-blue.svg?logo=pypi&labelColor=555555&style=for-the-badge)](https://pypi.org/project/yt-dlp)
[![Source Tarball](https://img.shields.io/badge/-Source_tar-green.svg?style=for-the-badge)](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.tar.gz) [![Source Tarball](https://img.shields.io/badge/-Source_tar-green.svg?style=for-the-badge)](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.tar.gz)
[![Other variants](https://img.shields.io/badge/-Other-grey.svg?style=for-the-badge)](#release-files) [![Other variants](https://img.shields.io/badge/-Other-grey.svg?style=for-the-badge)](#release-files)
[![All versions](https://img.shields.io/badge/-All_Versions-lightgrey.svg?style=for-the-badge)](https://github.com/yt-dlp/yt-dlp/releases) [![All versions](https://img.shields.io/badge/-All_Versions-lightgrey.svg?style=for-the-badge)](https://github.com/yt-dlp/yt-dlp/releases)
<!-- MANPAGE: END EXCLUDED SECTION --> <!-- MANPAGE: END EXCLUDED SECTION -->
Note: The manpages, shell completion files etc. are available in the [source tarball](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.tar.gz) You can install yt-dlp using [the binaries](#release-files), [PIP](https://pypi.org/project/yt-dlp) or using a third-party package manager. See [the wiki](https://github.com/yt-dlp/yt-dlp/wiki/Installation) for detailed instructions
<!-- TODO: Move to Wiki -->
In UNIX-like OSes (MacOS, Linux, BSD), you can also install the same in one of the following ways:
```
sudo curl -L https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -o /usr/local/bin/yt-dlp
sudo chmod a+rx /usr/local/bin/yt-dlp
```
```
sudo wget https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -O /usr/local/bin/yt-dlp
sudo chmod a+rx /usr/local/bin/yt-dlp
```
```
sudo aria2c https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp --dir /usr/local/bin -o yt-dlp
sudo chmod a+rx /usr/local/bin/yt-dlp
```
### With [PIP](https://pypi.org/project/pip)
You can install the [PyPI package](https://pypi.org/project/yt-dlp) with:
```
python3 -m pip install -U yt-dlp
```
You can install without any of the optional dependencies using:
```
python3 -m pip install --no-deps -U yt-dlp
```
If you want to be on the cutting edge, you can also install the master branch with:
```
python3 -m pip install --force-reinstall https://github.com/yt-dlp/yt-dlp/archive/master.tar.gz
```
On some systems, you may need to use `py` or `python` instead of `python3`
<!-- TODO: Add to Wiki, Remove Taps -->
### With [Homebrew](https://brew.sh)
macOS or Linux users that are using Homebrew can also install it by:
```
brew install yt-dlp/taps/yt-dlp
```
## UPDATE ## UPDATE
You can use `yt-dlp -U` to update if you are [using the provided release](#using-the-release-binary) You can use `yt-dlp -U` to update if you are [using the release binaries](#release-files)
If you [installed with pip](#with-pip), simply re-run the same command that was used to install the program If you [installed with PIP](https://github.com/yt-dlp/yt-dlp/wiki/Installation#with-pip), simply re-run the same command that was used to install the program
For other third-party package managers, see [the wiki](https://github.com/yt-dlp/yt-dlp/wiki/Installation) or refer to their documentation
If you [installed using Homebrew](#with-homebrew), run `brew upgrade yt-dlp/taps/yt-dlp`
<!-- MANPAGE: BEGIN EXCLUDED SECTION --> <!-- MANPAGE: BEGIN EXCLUDED SECTION -->
## RELEASE FILES ## RELEASE FILES
@ -255,11 +209,14 @@ #### Misc
File|Description File|Description
:---|:--- :---|:---
[yt-dlp.tar.gz](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.tar.gz)|Source tarball. Also contains manpages, completions, etc [yt-dlp.tar.gz](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.tar.gz)|Source tarball
[SHA2-512SUMS](https://github.com/yt-dlp/yt-dlp/releases/latest/download/SHA2-512SUMS)|GNU-style SHA512 sums [SHA2-512SUMS](https://github.com/yt-dlp/yt-dlp/releases/latest/download/SHA2-512SUMS)|GNU-style SHA512 sums
[SHA2-256SUMS](https://github.com/yt-dlp/yt-dlp/releases/latest/download/SHA2-256SUMS)|GNU-style SHA256 sums [SHA2-256SUMS](https://github.com/yt-dlp/yt-dlp/releases/latest/download/SHA2-256SUMS)|GNU-style SHA256 sums
<!-- MANPAGE: END EXCLUDED SECTION --> <!-- MANPAGE: END EXCLUDED SECTION -->
Note: The manpages, shell completion files etc. are available in the [source tarball](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.tar.gz)
## DEPENDENCIES ## DEPENDENCIES
Python versions 3.7+ (CPython and PyPy) are supported. Other versions and implementations may or may not work correctly. Python versions 3.7+ (CPython and PyPy) are supported. Other versions and implementations may or may not work correctly.
@ -295,7 +252,7 @@ ### Misc
* [**secretstorage**](https://github.com/mitya57/secretstorage) - For `--cookies-from-browser` to access the **Gnome** keyring while decrypting cookies of **Chromium**-based browsers on **Linux**. Licensed under [BSD-3-Clause](https://github.com/mitya57/secretstorage/blob/master/LICENSE) * [**secretstorage**](https://github.com/mitya57/secretstorage) - For `--cookies-from-browser` to access the **Gnome** keyring while decrypting cookies of **Chromium**-based browsers on **Linux**. Licensed under [BSD-3-Clause](https://github.com/mitya57/secretstorage/blob/master/LICENSE)
* Any external downloader that you want to use with `--downloader` * Any external downloader that you want to use with `--downloader`
#### Deprecated ### Deprecated
* [**avconv** and **avprobe**](https://www.libav.org) - Now **deprecated** alternative to ffmpeg. License [depends on the build](https://libav.org/legal) * [**avconv** and **avprobe**](https://www.libav.org) - Now **deprecated** alternative to ffmpeg. License [depends on the build](https://libav.org/legal)
* [**sponskrub**](https://github.com/faissaloo/SponSkrub) - For using the now **deprecated** [sponskrub options](#sponskrub-options). Licensed under [GPLv3+](https://github.com/faissaloo/SponSkrub/blob/master/LICENCE.md) * [**sponskrub**](https://github.com/faissaloo/SponSkrub) - For using the now **deprecated** [sponskrub options](#sponskrub-options). Licensed under [GPLv3+](https://github.com/faissaloo/SponSkrub/blob/master/LICENCE.md)
@ -312,7 +269,7 @@ #### Deprecated
## COMPILE ## COMPILE
### Standalone PyInstaller Builds ### Standalone PyInstaller Builds
To build the Windows/MacOS executable, you must have Python and `pyinstaller` (plus any of yt-dlp's [optional dependencies](#dependencies) if needed). Once you have all the necessary dependencies installed, simply run `pyinst.py`. The executable will be built for the same architecture (32/64 bit) as the Python used. To build the standalone executable, you must have Python and `pyinstaller` (plus any of yt-dlp's [optional dependencies](#dependencies) if needed). Once you have all the necessary dependencies installed, simply run `pyinst.py`. The executable will be built for the same architecture (x86/ARM, 32/64 bit) as the Python used.
python3 -m pip install -U pyinstaller -r requirements.txt python3 -m pip install -U pyinstaller -r requirements.txt
python3 devscripts/make_lazy_extractors.py python3 devscripts/make_lazy_extractors.py
@ -320,16 +277,18 @@ ### Standalone PyInstaller Builds
On some systems, you may need to use `py` or `python` instead of `python3`. On some systems, you may need to use `py` or `python` instead of `python3`.
Note that pyinstaller [does not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment. `pyinst.py` accepts any arguments that can be passed to `pyinstaller`, such as `--onefile/-F` or `--onedir/-D`, which is further [documented here](https://pyinstaller.org/en/stable/usage.html#what-to-generate).
Note that pyinstaller versions below 4.4 [do not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment.
**Important**: Running `pyinstaller` directly **without** using `pyinst.py` is **not** officially supported. This may or may not work correctly. **Important**: Running `pyinstaller` directly **without** using `pyinst.py` is **not** officially supported. This may or may not work correctly.
### Platform-independent Binary (UNIX) ### Platform-independent Binary (UNIX)
You will need the build tools `python` (3.6+), `zip`, `make` (GNU), `pandoc`\* and `pytest`\*. You will need the build tools `python` (3.7+), `zip`, `make` (GNU), `pandoc`\* and `pytest`\*.
After installing these, simply run `make`. After installing these, simply run `make`.
You can also run `make yt-dlp` instead to compile only the binary without updating any of the additional files. (The dependencies marked with **\*** are not needed for this) You can also run `make yt-dlp` instead to compile only the binary without updating any of the additional files. (The build tools marked with **\*** are not needed for this)
### Standalone Py2Exe Builds (Windows) ### Standalone Py2Exe Builds (Windows)
@ -343,10 +302,11 @@ ### Standalone Py2Exe Builds (Windows)
### Related scripts ### Related scripts
* **`devscripts/update-version.py`** - Update the version number based on current timestamp * **`devscripts/update-version.py [revision]`** - Update the version number based on current date
* **`devscripts/set-variant.py variant [-M update_message]`** - Set the build variant of the executable
* **`devscripts/make_lazy_extractors.py`** - Create lazy extractors. Running this before building the binaries (any variant) will improve their startup performance. Set the environment variable `YTDLP_NO_LAZY_EXTRACTORS=1` if you wish to forcefully disable lazy extractor loading. * **`devscripts/make_lazy_extractors.py`** - Create lazy extractors. Running this before building the binaries (any variant) will improve their startup performance. Set the environment variable `YTDLP_NO_LAZY_EXTRACTORS=1` if you wish to forcefully disable lazy extractor loading.
You can also fork the project on github and run your fork's [build workflow](.github/workflows/build.yml) to automatically build a full release You can also fork the project on GitHub and run your fork's [build workflow](.github/workflows/build.yml) to automatically build a full release
# USAGE AND OPTIONS # USAGE AND OPTIONS
@ -360,8 +320,8 @@ # USAGE AND OPTIONS
## General Options: ## General Options:
-h, --help Print this help text and exit -h, --help Print this help text and exit
--version Print program version and exit --version Print program version and exit
-U, --update Update this program to latest version -U, --update Update this program to the latest version
--no-update Do not update (default) --no-update Do not check for updates (default)
-i, --ignore-errors Ignore download and postprocessing errors. -i, --ignore-errors Ignore download and postprocessing errors.
The download will be considered successful The download will be considered successful
even if the postprocessing fails even if the postprocessing fails
@ -374,8 +334,14 @@ ## General Options:
--list-extractors List all supported extractors and exit --list-extractors List all supported extractors and exit
--extractor-descriptions Output descriptions of all supported --extractor-descriptions Output descriptions of all supported
extractors and exit extractors and exit
--force-generic-extractor Force extraction to use the generic extractor --use-extractors NAMES Extractor names to use separated by commas.
--default-search PREFIX Use this prefix for unqualified URLs. Eg: You can also use regexes, "all", "default"
and "end" (end URL matching); e.g. --ies
"holodex.*,end,youtube". Prefix the name
with a "-" to exclude it, e.g. --ies
default,-generic. Use --list-extractors for
a list of extractor names. (Alias: --ies)
--default-search PREFIX Use this prefix for unqualified URLs. E.g.
"gvsearch2:python" downloads two videos from "gvsearch2:python" downloads two videos from
google videos for the search term "python". google videos for the search term "python".
Use the value "auto" to let yt-dlp guess Use the value "auto" to let yt-dlp guess
@ -424,7 +390,7 @@ ## General Options:
an alias starts with a dash "-", it is an alias starts with a dash "-", it is
prefixed with "--". Arguments are parsed prefixed with "--". Arguments are parsed
according to the Python string formatting according to the Python string formatting
mini-language. Eg: --alias get-audio,-X mini-language. E.g. --alias get-audio,-X
"-S=aext:{0},abr -x --audio-format {0}" "-S=aext:{0},abr -x --audio-format {0}"
creates options "--get-audio" and "-X" that creates options "--get-audio" and "-X" that
take an argument (ARG0) and expand to take an argument (ARG0) and expand to
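
The `--alias` description above says that arguments are substituted using the Python string formatting mini-language. A minimal sketch of that substitution, assuming the template is filled with a plain `str.format` call and then split shell-style (the helper `expand_alias` is hypothetical and is not part of yt-dlp's API):

```python
import shlex


def expand_alias(template, *args):
    """Hypothetical helper: fill {0}, {1}, ... and split the result like a shell would."""
    return shlex.split(template.format(*args))


# The alias from the help text, invoked as "--get-audio mp3"
print(expand_alias("-S=aext:{0},abr -x --audio-format {0}", "mp3"))
```

Because `{0}` appears twice in the template, the single argument is reused in both the sort order and the audio format.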
@ -438,10 +404,10 @@ ## General Options:
## Network Options: ## Network Options:
--proxy URL Use the specified HTTP/HTTPS/SOCKS proxy. To --proxy URL Use the specified HTTP/HTTPS/SOCKS proxy. To
enable SOCKS proxy, specify a proper scheme. enable SOCKS proxy, specify a proper scheme,
Eg: socks5://user:pass@127.0.0.1:1080/. Pass e.g. socks5://user:pass@127.0.0.1:1080/.
in an empty string (--proxy "") for direct Pass in an empty string (--proxy "") for
connection direct connection
--socket-timeout SECONDS Time to wait before giving up, in seconds --socket-timeout SECONDS Time to wait before giving up, in seconds
--source-address IP Client-side IP address to bind to --source-address IP Client-side IP address to bind to
-4, --force-ipv4 Make all connections via IPv4 -4, --force-ipv4 Make all connections via IPv4
@ -470,17 +436,17 @@ ## Video Selection:
compatibility, START-STOP is also supported. compatibility, START-STOP is also supported.
Use negative indices to count from the right Use negative indices to count from the right
and negative STEP to download in reverse and negative STEP to download in reverse
order. Eg: "-I 1:3,7,-5::2" used on a order. E.g. "-I 1:3,7,-5::2" used on a
playlist of size 15 will download the videos playlist of size 15 will download the videos
at index 1,2,3,7,11,13,15 at index 1,2,3,7,11,13,15
--min-filesize SIZE Do not download any videos smaller than SIZE --min-filesize SIZE Do not download any videos smaller than
(e.g. 50k or 44.6m) SIZE, e.g. 50k or 44.6M
--max-filesize SIZE Do not download any videos larger than SIZE --max-filesize SIZE Do not download any videos larger than SIZE,
(e.g. 50k or 44.6m) e.g. 50k or 44.6M
--date DATE Download only videos uploaded on this date. --date DATE Download only videos uploaded on this date.
The date can be "YYYYMMDD" or in the format The date can be "YYYYMMDD" or in the format
[now|today|yesterday][-N[day|week|month|year]]. [now|today|yesterday][-N[day|week|month|year]].
Eg: --date today-2weeks E.g. --date today-2weeks
--datebefore DATE Download only videos uploaded on or before --datebefore DATE Download only videos uploaded on or before
this date. The date formats accepted are the this date. The date formats accepted are the
same as --date same as --date
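
The `-I` index syntax described earlier in this hunk can be checked with a small sketch. It assumes 1-based indices, an inclusive STOP, and negative indices counting from the end, as the help text states; the function `select_items` is hypothetical, covers only single indices and `START:STOP:STEP` with a positive STEP, and is not yt-dlp's own parser:

```python
def select_items(spec, size):
    """Hypothetical sketch: expand an -I/--playlist-items spec into 1-based indices.

    The legacy START-STOP form and negative STEP are left out for brevity.
    """
    def resolve(value, default):
        if value == "":
            return default
        index = int(value)
        # Negative indices count from the right: -1 is the last entry
        return index + size + 1 if index < 0 else index

    selected = []
    for item in spec.split(","):
        parts = item.split(":")
        if len(parts) == 1:
            selected.append(resolve(parts[0], None))
            continue
        start = resolve(parts[0], 1)
        stop = resolve(parts[1], size)
        step = int(parts[2]) if len(parts) > 2 and parts[2] else 1
        if step <= 0:
            raise ValueError("this sketch only supports a positive STEP")
        # STOP is inclusive, unlike Python's range()
        selected.extend(range(start, stop + 1, step))
    return selected


# The example from the help text: a playlist of size 15
print(select_items("1:3,7,-5::2", 15))
```

For the documented example this yields the indices 1, 2, 3, 7, 11, 13, 15, matching the help text.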
@ -497,7 +463,7 @@ ## Video Selection:
conditions. Use a "\" to escape "&" or conditions. Use a "\" to escape "&" or
quotes if needed. If used multiple times, quotes if needed. If used multiple times,
the filter matches if at least one of the the filter matches if at least one of the
conditions are met. Eg: --match-filter conditions are met. E.g. --match-filter
!is_live --match-filter "like_count>?100 & !is_live --match-filter "like_count>?100 &
description~='(?i)\bcats \& dogs\b'" matches description~='(?i)\bcats \& dogs\b'" matches
only videos that are not live OR those that only videos that are not live OR those that
@ -523,8 +489,8 @@ ## Video Selection:
a file that is in the archive a file that is in the archive
--break-on-reject Stop the download process when encountering --break-on-reject Stop the download process when encountering
a file that has been filtered out a file that has been filtered out
--break-per-input Make --break-on-existing, --break-on-reject --break-per-input --break-on-existing, --break-on-reject,
and --max-downloads act only on the current --max-downloads, and autonumber resets per
input URL input URL
--no-break-per-input --break-on-existing and similar options --no-break-per-input --break-on-existing and similar options
terminate the entire download queue terminate the entire download queue
@ -535,11 +501,11 @@ ## Download Options:
-N, --concurrent-fragments N Number of fragments of a dash/hlsnative -N, --concurrent-fragments N Number of fragments of a dash/hlsnative
video that should be downloaded concurrently video that should be downloaded concurrently
(default is 1) (default is 1)
-r, --limit-rate RATE Maximum download rate in bytes per second -r, --limit-rate RATE Maximum download rate in bytes per second,
(e.g. 50K or 4.2M) e.g. 50K or 4.2M
--throttled-rate RATE Minimum download rate in bytes per second --throttled-rate RATE Minimum download rate in bytes per second
below which throttling is assumed and the below which throttling is assumed and the
video data is re-extracted (e.g. 100K) video data is re-extracted, e.g. 100K
-R, --retries RETRIES Number of retries (default is 10), or -R, --retries RETRIES Number of retries (default is 10), or
"infinite" "infinite"
--file-access-retries RETRIES Number of times to retry on file access --file-access-retries RETRIES Number of times to retry on file access
@ -553,7 +519,7 @@ ## Download Options:
be a number, linear=START[:END[:STEP=1]] or be a number, linear=START[:END[:STEP=1]] or
exp=START[:END[:BASE=2]]. This option can be exp=START[:END[:BASE=2]]. This option can be
used multiple times to set the sleep for the used multiple times to set the sleep for the
different retry types. Eg: --retry-sleep different retry types, e.g. --retry-sleep
linear=1::2 --retry-sleep fragment:exp=1:20 linear=1::2 --retry-sleep fragment:exp=1:20
--skip-unavailable-fragments Skip unavailable fragments for DASH, --skip-unavailable-fragments Skip unavailable fragments for DASH,
hlsnative and ISM downloads (default) hlsnative and ISM downloads (default)
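
The `linear=START[:END[:STEP=1]]` and `exp=START[:END[:BASE=2]]` expressions for `--retry-sleep` can be illustrated as follows. This is a sketch under stated assumptions: the attempt counter `n` starts at 0, END acts as an upper cap, and the defaults are the ones written in the help text. The function `retry_sleep` is hypothetical, it ignores the retry-type prefix such as `fragment:`, and yt-dlp's real implementation may differ in details:

```python
import math


def retry_sleep(expr):
    """Hypothetical sketch: turn a --retry-sleep expression into a function of the attempt number."""
    kind, _, params = expr.partition("=")
    if not params:
        # A bare number means a constant sleep
        constant = float(kind)
        return lambda n: constant
    fields = params.split(":")
    start = float(fields[0])
    end = float(fields[1]) if len(fields) > 1 and fields[1] else math.inf
    third = float(fields[2]) if len(fields) > 2 and fields[2] else None
    if kind == "linear":
        step = 1.0 if third is None else third
        return lambda n: min(start + step * n, end)
    if kind == "exp":
        base = 2.0 if third is None else third
        return lambda n: min(start * base ** n, end)
    raise ValueError(f"unknown retry-sleep type: {kind}")


linear = retry_sleep("linear=1::2")    # no END, STEP of 2
exponential = retry_sleep("exp=1:20")  # doubles each time, capped at 20
print([linear(n) for n in range(4)])
print([exponential(n) for n in range(6)])
```

Under these assumptions, `linear=1::2` grows by 2 seconds per attempt without a cap, while `exp=1:20` doubles until it reaches the 20 second limit.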
@ -565,14 +531,14 @@ ## Download Options:
downloading is finished downloading is finished
--no-keep-fragments Delete downloaded fragments after --no-keep-fragments Delete downloaded fragments after
downloading is finished (default) downloading is finished (default)
--buffer-size SIZE Size of download buffer (e.g. 1024 or 16K) --buffer-size SIZE Size of download buffer, e.g. 1024 or 16K
(default is 1024) (default is 1024)
--resize-buffer The buffer size is automatically resized --resize-buffer The buffer size is automatically resized
from an initial value of --buffer-size from an initial value of --buffer-size
(default) (default)
--no-resize-buffer Do not automatically adjust the buffer size --no-resize-buffer Do not automatically adjust the buffer size
--http-chunk-size SIZE Size of a chunk for chunk-based HTTP --http-chunk-size SIZE Size of a chunk for chunk-based HTTP
downloading (e.g. 10485760 or 10M) (default downloading, e.g. 10485760 or 10M (default
is disabled). May be useful for bypassing is disabled). May be useful for bypassing
bandwidth throttling imposed by a webserver bandwidth throttling imposed by a webserver
(experimental) (experimental)
@ -597,10 +563,10 @@ ## Download Options:
the given regular expression. Time ranges the given regular expression. Time ranges
prefixed by a "*" can also be used in place prefixed by a "*" can also be used in place
of chapters to download the specified range. of chapters to download the specified range.
Eg: --download-sections "*10:15-15:00" Needs ffmpeg. This option can be used
--download-sections "intro". Needs ffmpeg. multiple times to download multiple
This option can be used multiple times to sections, e.g. --download-sections
download multiple sections "*10:15-inf" --download-sections "intro"
--downloader [PROTO:]NAME Name or path of the external downloader to --downloader [PROTO:]NAME Name or path of the external downloader to
use (optionally) prefixed by the protocols use (optionally) prefixed by the protocols
(http, ftp, m3u8, dash, rtsp, rtmp, mms) to (http, ftp, m3u8, dash, rtsp, rtmp, mms) to
@ -608,7 +574,7 @@ ## Download Options:
aria2c, avconv, axel, curl, ffmpeg, httpie, aria2c, avconv, axel, curl, ffmpeg, httpie,
wget. You can use this option multiple times wget. You can use this option multiple times
to set different downloaders for different to set different downloaders for different
protocols. For example, --downloader aria2c protocols. E.g. --downloader aria2c
--downloader "dash,m3u8:native" will use --downloader "dash,m3u8:native" will use
aria2c for http/ftp downloads, and the aria2c for http/ftp downloads, and the
native downloader for dash/m3u8 downloads native downloader for dash/m3u8 downloads
@ -699,24 +665,25 @@ ## Filesystem Options:
and dump cookie jar in and dump cookie jar in
--no-cookies Do not read/dump cookies from/to file --no-cookies Do not read/dump cookies from/to file
(default) (default)
--cookies-from-browser BROWSER[+KEYRING][:PROFILE] --cookies-from-browser BROWSER[+KEYRING][:PROFILE][::CONTAINER]
The name of the browser and (optionally) the The name of the browser to load cookies
name/path of the profile to load cookies from. Currently supported browsers are:
from, separated by a ":". Currently brave, chrome, chromium, edge, firefox,
supported browsers are: brave, chrome, opera, safari, vivaldi. Optionally, the
chromium, edge, firefox, opera, safari, KEYRING used for decrypting Chromium cookies
vivaldi. By default, the most recently on Linux, the name/path of the PROFILE to
accessed profile is used. The keyring used load cookies from, and the CONTAINER name
for decrypting Chromium cookies on Linux can (if Firefox) ("none" for no container) can
be (optionally) specified after the browser be given with their respective separators.
name separated by a "+". Currently supported By default, all containers of the most
keyrings are: basictext, gnomekeyring, kwallet recently accessed profile are used.
Currently supported keyrings are: basictext,
gnomekeyring, kwallet
--no-cookies-from-browser Do not load cookies from browser (default) --no-cookies-from-browser Do not load cookies from browser (default)
--cache-dir DIR Location in the filesystem where youtube-dl --cache-dir DIR Location in the filesystem where yt-dlp can
can store some downloaded information (such store some downloaded information (such as
as client ids and signatures) permanently. client ids and signatures) permanently. By
By default $XDG_CACHE_HOME/yt-dlp or default ${XDG_CACHE_HOME}/yt-dlp
~/.cache/yt-dlp
--no-cache-dir Disable filesystem caching --no-cache-dir Disable filesystem caching
--rm-cache-dir Delete all filesystem cache files --rm-cache-dir Delete all filesystem cache files
@ -790,7 +757,7 @@ ## Verbosity and Simulation Options:
"postprocess:", or "postprocess-title:". "postprocess:", or "postprocess-title:".
The video's fields are accessible under the The video's fields are accessible under the
"info" key and the progress attributes are "info" key and the progress attributes are
accessible under "progress" key. E.g.: accessible under "progress" key. E.g.
--console-title --progress-template --console-title --progress-template
"download-title:%(info.id)s-%(progress.eta)s" "download-title:%(info.id)s-%(progress.eta)s"
-v, --verbose Print various debugging information -v, --verbose Print various debugging information
@ -859,7 +826,7 @@ ## Video Format Options:
-F, --list-formats List available formats of each video. -F, --list-formats List available formats of each video.
Simulate unless --no-simulate is used Simulate unless --no-simulate is used
--merge-output-format FORMAT Containers that may be used when merging --merge-output-format FORMAT Containers that may be used when merging
formats, separated by "/" (Eg: "mp4/mkv"). formats, separated by "/", e.g. "mp4/mkv".
Ignored if no merge is required. (currently Ignored if no merge is required. (currently
supported: avi, flv, mkv, mov, mp4, webm) supported: avi, flv, mkv, mov, mp4, webm)
@ -873,13 +840,13 @@ ## Subtitle Options:
--list-subs List available subtitles of each video. --list-subs List available subtitles of each video.
Simulate unless --no-simulate is used Simulate unless --no-simulate is used
--sub-format FORMAT Subtitle format; accepts formats preference, --sub-format FORMAT Subtitle format; accepts formats preference,
Eg: "srt" or "ass/srt/best" e.g. "srt" or "ass/srt/best"
--sub-langs LANGS Languages of the subtitles to download (can --sub-langs LANGS Languages of the subtitles to download (can
be regex) or "all" separated by commas. (Eg: be regex) or "all" separated by commas, e.g.
--sub-langs "en.*,ja") You can prefix the --sub-langs "en.*,ja". You can prefix the
language code with a "-" to exclude it from language code with a "-" to exclude it from
the requested languages. (Eg: --sub-langs the requested languages, e.g. --sub-langs
all,-live_chat) Use --list-subs for a list all,-live_chat. Use --list-subs for a list
of available language tags of available language tags
## Authentication Options: ## Authentication Options:
@ -928,7 +895,7 @@ ## Post-Processing Options:
m4a, mka, mp3, ogg, opus, vorbis, wav). If m4a, mka, mp3, ogg, opus, vorbis, wav). If
target container does not support the target container does not support the
video/audio codec, remuxing will fail. You video/audio codec, remuxing will fail. You
can specify multiple rules; Eg. can specify multiple rules; e.g.
"aac>m4a/mov>mp4/mkv" will remux aac to m4a, "aac>m4a/mov>mp4/mkv" will remux aac to m4a,
mov to mp4 and anything else to mkv mov to mp4 and anything else to mkv
--recode-video FORMAT Re-encode the video into another format if --recode-video FORMAT Re-encode the video into another format if
@ -953,7 +920,7 @@ ## Post-Processing Options:
for ffmpeg/ffprobe, "_i"/"_o" can be for ffmpeg/ffprobe, "_i"/"_o" can be
appended to the prefix optionally followed appended to the prefix optionally followed
by a number to pass the argument before the by a number to pass the argument before the
specified input/output file. Eg: --ppa specified input/output file, e.g. --ppa
"Merger+ffmpeg_i1:-v quiet". You can use "Merger+ffmpeg_i1:-v quiet". You can use
this option multiple times to give different this option multiple times to give different
arguments to different postprocessors. arguments to different postprocessors.
@ -1077,10 +1044,10 @@ ## SponsorBlock Options:
for, separated by commas. Available for, separated by commas. Available
categories are sponsor, intro, outro, categories are sponsor, intro, outro,
selfpromo, preview, filler, interaction, selfpromo, preview, filler, interaction,
music_offtopic, poi_highlight, all and music_offtopic, poi_highlight, chapter, all and
default (=all). You can prefix the category default (=all). You can prefix the category
with a "-" to exclude it. See [1] for with a "-" to exclude it. See [1] for
description of the categories. Eg: description of the categories. E.g.
--sponsorblock-mark all,-preview --sponsorblock-mark all,-preview
[1] https://wiki.sponsor.ajay.app/w/Segment_Categories [1] https://wiki.sponsor.ajay.app/w/Segment_Categories
--sponsorblock-remove CATS SponsorBlock categories to be removed from --sponsorblock-remove CATS SponsorBlock categories to be removed from
@ -1089,8 +1056,8 @@ ## SponsorBlock Options:
remove takes precedence. The syntax and remove takes precedence. The syntax and
available categories are the same as for available categories are the same as for
--sponsorblock-mark except that "default" --sponsorblock-mark except that "default"
refers to "all,-filler" and poi_highlight is refers to "all,-filler" and poi_highlight and
not available chapter are not available
--sponsorblock-chapter-title TEMPLATE --sponsorblock-chapter-title TEMPLATE
An output template for the title of the An output template for the title of the
SponsorBlock chapters created by SponsorBlock chapters created by
@ -1115,31 +1082,36 @@ ## Extractor Options:
--no-hls-split-discontinuity Do not split HLS playlists to different --no-hls-split-discontinuity Do not split HLS playlists to different
formats at discontinuities such as ad breaks formats at discontinuities such as ad breaks
(default) (default)
--extractor-args KEY:ARGS Pass these arguments to the extractor. See --extractor-args IE_KEY:ARGS Pass ARGS arguments to the IE_KEY extractor.
"EXTRACTOR ARGUMENTS" for details. You can See "EXTRACTOR ARGUMENTS" for details. You
use this option multiple times to give can use this option multiple times to give
arguments for different extractors arguments for different extractors
# CONFIGURATION # CONFIGURATION
You can configure yt-dlp by placing any supported command line option in a configuration file. The configuration is loaded from the following locations: You can configure yt-dlp by placing any supported command line option in a configuration file. The configuration is loaded from the following locations:
1. **Main Configuration**: The file given by `--config-location` 1. **Main Configuration**:
1. **Portable Configuration**: `yt-dlp.conf` in the same directory as the bundled binary. If you are running from source-code (`<root dir>/yt_dlp/__main__.py`), the root directory is used instead. * The file given by `--config-location`
1. **Home Configuration**: `yt-dlp.conf` in the home path given by `-P`, or in the current directory if no such path is given 1. **Portable Configuration**: (Recommended for portable installations)
* If using a binary, `yt-dlp.conf` in the same directory as the binary
* If running from source-code, `yt-dlp.conf` in the parent directory of `yt_dlp`
1. **Home Configuration**:
* `yt-dlp.conf` in the home path given by `-P`
* If `-P` is not given, the current directory is searched
1. **User Configuration**: 1. **User Configuration**:
* `%XDG_CONFIG_HOME%/yt-dlp/config` (recommended on Linux/macOS) * `${XDG_CONFIG_HOME}/yt-dlp/config` (recommended on Linux/macOS)
* `%XDG_CONFIG_HOME%/yt-dlp.conf` * `${XDG_CONFIG_HOME}/yt-dlp.conf`
* `%APPDATA%/yt-dlp/config` (recommended on Windows) * `${APPDATA}/yt-dlp/config` (recommended on Windows)
* `%APPDATA%/yt-dlp/config.txt` * `${APPDATA}/yt-dlp/config.txt`
* `~/yt-dlp.conf` * `~/yt-dlp.conf`
* `~/yt-dlp.conf.txt` * `~/yt-dlp.conf.txt`
`%XDG_CONFIG_HOME%` defaults to `~/.config` if undefined. On windows, `%APPDATA%` generally points to `C:\Users\<user name>\AppData\Roaming` and `~` points to `%HOME%` if present, `%USERPROFILE%` (generally `C:\Users\<user name>`), or `%HOMEDRIVE%%HOMEPATH%` See also: [Notes about environment variables](#notes-about-environment-variables)
1. **System Configuration**:
* `/etc/yt-dlp.conf`
1. **System Configuration**: `/etc/yt-dlp.conf` E.g. with the following configuration file yt-dlp will always extract the audio, not copy the mtime, use a proxy and save all videos under `YouTube` directory in your home directory:
For example, with the following configuration file yt-dlp will always extract the audio, not copy the mtime, use a proxy and save all videos under `YouTube` directory in your home directory:
``` ```
# Lines starting with # are comments # Lines starting with # are comments
@ -1156,35 +1128,42 @@ # Save all videos under YouTube directory in your home directory
-o ~/YouTube/%(title)s.%(ext)s -o ~/YouTube/%(title)s.%(ext)s
``` ```
Note that options in configuration file are just the same options aka switches used in regular command line calls; thus there **must be no whitespace** after `-` or `--`, e.g. `-o` or `--proxy` but not `- o` or `-- proxy`. Note that options in configuration file are just the same options aka switches used in regular command line calls; thus there **must be no whitespace** after `-` or `--`, e.g. `-o` or `--proxy` but not `- o` or `-- proxy`. They must also be quoted when necessary, as if they were typed in a UNIX shell.
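The quoting rule can be illustrated with Python's standard `shlex` module, which splits text using POSIX shell quoting rules. This is a minimal sketch under the assumption that configuration text is tokenized in a comparable way; the configuration text and the `tokenize_config` helper are hypothetical and not taken from yt-dlp itself.

```python
import shlex

# Hypothetical configuration text. The output template contains spaces,
# so it has to be quoted exactly as it would be in a UNIX shell.
CONFIG_TEXT = """\
# Lines starting with # are comments
--proxy 127.0.0.1:3128
-o "~/YouTube/%(title)s - %(id)s.%(ext)s"
"""


def tokenize_config(text):
    """Split configuration text into arguments, dropping # comments."""
    return shlex.split(text, comments=True)


args = tokenize_config(CONFIG_TEXT)
print(args)
```

Without the quotes around the template, the same line would be split into several separate arguments at each space.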
You can use `--ignore-config` if you want to disable all configuration files for a particular yt-dlp run. If `--ignore-config` is found inside any configuration file, no further configuration will be loaded. For example, having the option in the portable configuration file prevents loading of home, user, and system configurations. Additionally, (for backward compatibility) if `--ignore-config` is found inside the system configuration file, the user configuration is not loaded. You can use `--ignore-config` if you want to disable all configuration files for a particular yt-dlp run. If `--ignore-config` is found inside any configuration file, no further configuration will be loaded. For example, having the option in the portable configuration file prevents loading of home, user, and system configurations. Additionally, (for backward compatibility) if `--ignore-config` is found inside the system configuration file, the user configuration is not loaded.
### Config file encoding ### Configuration file encoding
The config files are decoded according to the UTF BOM if present, and in the encoding from system locale otherwise. The configuration files are decoded according to the UTF BOM if present, and in the encoding from system locale otherwise.
If you want your file to be decoded differently, add `# coding: ENCODING` to the beginning of the file (e.g. `# coding: shift-jis`). There must be no characters before that, even spaces or BOM. If you want your file to be decoded differently, add `# coding: ENCODING` to the beginning of the file (e.g. `# coding: shift-jis`). There must be no characters before that, even spaces or BOM.
### Authentication with `.netrc` file ### Authentication with `.netrc` file
You may also want to configure automatic credentials storage for extractors that support authentication (by providing login and password with `--username` and `--password`) in order not to pass credentials as command line arguments on every yt-dlp execution and prevent tracking plain text passwords in the shell command history. You can achieve this using a [`.netrc` file](https://stackoverflow.com/tags/.netrc/info) on a per extractor basis. For that you will need to create a `.netrc` file in `--netrc-location` and restrict permissions to read/write by only you: You may also want to configure automatic credentials storage for extractors that support authentication (by providing login and password with `--username` and `--password`) in order not to pass credentials as command line arguments on every yt-dlp execution and prevent tracking plain text passwords in the shell command history. You can achieve this using a [`.netrc` file](https://stackoverflow.com/tags/.netrc/info) on a per-extractor basis. For that you will need to create a `.netrc` file in `--netrc-location` and restrict permissions to read/write by only you:
``` ```
touch $HOME/.netrc touch ${HOME}/.netrc
chmod a-rwx,u+rw $HOME/.netrc chmod a-rwx,u+rw ${HOME}/.netrc
``` ```
After that you can add credentials for an extractor in the following format, where *extractor* is the name of the extractor in lowercase: After that you can add credentials for an extractor in the following format, where *extractor* is the name of the extractor in lowercase:
``` ```
machine <extractor> login <username> password <password> machine <extractor> login <username> password <password>
``` ```
For example: E.g.
``` ```
machine youtube login myaccount@gmail.com password my_youtube_password machine youtube login myaccount@gmail.com password my_youtube_password
machine twitch login my_twitch_account_name password my_twitch_password machine twitch login my_twitch_account_name password my_twitch_password
``` ```
To activate authentication with the `.netrc` file you should pass `--netrc` to yt-dlp or place it in the [configuration file](#configuration). To activate authentication with the `.netrc` file you should pass `--netrc` to yt-dlp or place it in the [configuration file](#configuration).
The default location of the .netrc file is `$HOME` (`~`) in UNIX. On Windows, it is `%HOME%` if present, `%USERPROFILE%` (generally `C:\Users\<user name>`) or `%HOMEDRIVE%%HOMEPATH%` The default location of the .netrc file is `~` (see below).
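To check that a `.netrc` file is well-formed before relying on it, the entries can be read back with Python's standard `netrc` module. This is only a sketch: the `lookup_credentials` helper is hypothetical, the credentials are the placeholder values from the example above, and it does not show how yt-dlp itself reads the file.

```python
import netrc
import os
import tempfile

# Placeholder entries in the "machine <extractor> login <username> password <password>" format.
NETRC_TEXT = (
    "machine youtube login myaccount@gmail.com password my_youtube_password\n"
    "machine twitch login my_twitch_account_name password my_twitch_password\n"
)


def lookup_credentials(netrc_path, extractor):
    """Return (login, password) for an extractor, or None if it has no entry."""
    entry = netrc.netrc(netrc_path).authenticators(extractor)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, ".netrc")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(NETRC_TEXT)
    os.chmod(path, 0o600)  # read/write for the owner only
    youtube = lookup_credentials(path, "youtube")
    missing = lookup_credentials(path, "vimeo")

print(youtube, missing)
```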
### Notes about environment variables
* Environment variables are normally specified as `${VARIABLE}`/`$VARIABLE` on UNIX and `%VARIABLE%` on Windows, but are always shown as `${VARIABLE}` in this documentation
* yt-dlp also allows using UNIX-style variables on Windows for path-like options, e.g. `--output`, `--config-location`
* If unset, `${XDG_CONFIG_HOME}` defaults to `~/.config` and `${XDG_CACHE_HOME}` to `~/.cache`
* On Windows, `~` points to `${HOME}` if present; otherwise to `${USERPROFILE}` or `${HOMEDRIVE}${HOMEPATH}`
* On Windows, `${USERPROFILE}` generally points to `C:\Users\<user name>` and `${APPDATA}` to `${USERPROFILE}\AppData\Roaming`
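The fallback rules in these notes can be written out as a small Python sketch. Both helper functions are hypothetical and take the environment as a plain dictionary so the behaviour is easy to follow; they only restate the documented order and are not yt-dlp's own implementation.

```python
def windows_home(env):
    """Resolve `~` on Windows: HOME first, then USERPROFILE, then HOMEDRIVE + HOMEPATH."""
    if env.get("HOME"):
        return env["HOME"]
    if env.get("USERPROFILE"):
        return env["USERPROFILE"]
    return env.get("HOMEDRIVE", "") + env.get("HOMEPATH", "")


def xdg_dirs(env):
    """Return (config_home, cache_home), applying the defaults used when unset."""
    config_home = env.get("XDG_CONFIG_HOME") or "~/.config"
    cache_home = env.get("XDG_CACHE_HOME") or "~/.cache"
    return config_home, cache_home


print(windows_home({"USERPROFILE": r"C:\Users\me"}))
print(xdg_dirs({"XDG_CONFIG_HOME": "/custom/config"}))
```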
# OUTPUT TEMPLATE # OUTPUT TEMPLATE
@ -1196,39 +1175,38 @@ # OUTPUT TEMPLATE
The simplest usage of `-o` is not to set any template arguments when downloading a single file, like in `yt-dlp -o funny_video.flv "https://some/video"` (hard-coding file extension like this is _not_ recommended and could break some post-processing). The simplest usage of `-o` is not to set any template arguments when downloading a single file, like in `yt-dlp -o funny_video.flv "https://some/video"` (hard-coding file extension like this is _not_ recommended and could break some post-processing).
It may however also contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [Python string formatting operations](https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by formatting operations. It may however also contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [Python string formatting operations](https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting), e.g. `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by formatting operations.
The field names themselves (the part inside the parenthesis) can also have some special formatting: The field names themselves (the part inside the parenthesis) can also have some special formatting:
1. **Object traversal**: The dictionaries and lists available in metadata can be traversed by using a `.` (dot) separator. You can also do python slicing using `:`. Eg: `%(tags.0)s`, `%(subtitles.en.-1.ext)s`, `%(id.3:7:-1)s`, `%(formats.:.format_id)s`. `%()s` refers to the entire infodict. Note that all the fields that become available using this method are not listed below. Use `-j` to see such fields 1. **Object traversal**: The dictionaries and lists available in metadata can be traversed by using a dot `.` separator; e.g. `%(tags.0)s`, `%(subtitles.en.-1.ext)s`. You can do Python slicing with colon `:`; E.g. `%(id.3:7:-1)s`, `%(formats.:.format_id)s`. Curly braces `{}` can be used to build dictionaries with only specific keys; e.g. `%(formats.:.{format_id,height})#j`. An empty field name `%()s` refers to the entire infodict; e.g. `%(.{id,title})s`. Note that all the fields that become available using this method are not listed below. Use `-j` to see such fields
1. **Addition**: Addition and subtraction of numeric fields can be done using `+` and `-` respectively. Eg: `%(playlist_index+10)03d`, `%(n_entries+1-playlist_index)d` 1. **Addition**: Addition and subtraction of numeric fields can be done using `+` and `-` respectively. E.g. `%(playlist_index+10)03d`, `%(n_entries+1-playlist_index)d`
1. **Date/time Formatting**: Date/time fields can be formatted according to [strftime formatting](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) by specifying it separated from the field name using a `>`. Eg: `%(duration>%H-%M-%S)s`, `%(upload_date>%Y-%m-%d)s`, `%(epoch-3600>%H-%M-%S)s` 1. **Date/time Formatting**: Date/time fields can be formatted according to [strftime formatting](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) by specifying it separated from the field name using a `>`. E.g. `%(duration>%H-%M-%S)s`, `%(upload_date>%Y-%m-%d)s`, `%(epoch-3600>%H-%M-%S)s`
1. **Alternatives**: Alternate fields can be specified separated with a `,`. Eg: `%(release_date>%Y,upload_date>%Y|Unknown)s` 1. **Alternatives**: Alternate fields can be specified separated with a `,`. E.g. `%(release_date>%Y,upload_date>%Y|Unknown)s`
1. **Replacement**: A replacement value can specified using a `&` separator. If the field is *not* empty, this replacement value will be used instead of the actual field content. This is done after alternate fields are considered; thus the replacement is used if *any* of the alternative fields is *not* empty. 1. **Replacement**: A replacement value can be specified using a `&` separator. If the field is *not* empty, this replacement value will be used instead of the actual field content. This is done after alternate fields are considered; thus the replacement is used if *any* of the alternative fields is *not* empty.
1. **Default**: A literal default value can be specified for when the field is empty using a `|` separator. This overrides `--output-na-template`. Eg: `%(uploader|Unknown)s` 1. **Default**: A literal default value can be specified for when the field is empty using a `|` separator. This overrides `--output-na-placeholder`. E.g. `%(uploader|Unknown)s`
1. **More Conversions**: In addition to the normal format types `diouxXeEfFgGcrs`, yt-dlp additionally supports converting to `B` = **B**ytes, `j` = **j**son (flag `#` for pretty-printing), `h` = HTML escaping, `l` = a comma separated **l**ist (flag `#` for `\n` newline-separated), `q` = a string **q**uoted for the terminal (flag `#` to split a list into different arguments), `D` = add **D**ecimal suffixes (Eg: 10M) (flag `#` to use 1024 as factor), and `S` = **S**anitize as filename (flag `#` for restricted) 1. **More Conversions**: In addition to the normal format types `diouxXeEfFgGcrs`, yt-dlp additionally supports converting to `B` = **B**ytes, `j` = **j**son (flag `#` for pretty-printing, `+` for Unicode), `h` = HTML escaping, `l` = a comma separated **l**ist (flag `#` for `\n` newline-separated), `q` = a string **q**uoted for the terminal (flag `#` to split a list into different arguments), `D` = add **D**ecimal suffixes (e.g. 10M) (flag `#` to use 1024 as factor), and `S` = **S**anitize as filename (flag `#` for restricted)
1. **Unicode normalization**: The format type `U` can be used for NFC [unicode normalization](https://docs.python.org/3/library/unicodedata.html#unicodedata.normalize). The alternate form flag (`#`) changes the normalization to NFD and the conversion flag `+` can be used for NFKC/NFKD compatibility equivalence normalization. Eg: `%(title)+.100U` is NFKC 1. **Unicode normalization**: The format type `U` can be used for NFC [Unicode normalization](https://docs.python.org/3/library/unicodedata.html#unicodedata.normalize). The alternate form flag (`#`) changes the normalization to NFD and the conversion flag `+` can be used for NFKC/NFKD compatibility equivalence normalization. E.g. `%(title)+.100U` is NFKC
To summarize, the general syntax for a field is: To summarize, the general syntax for a field is:
``` ```
%(name[.keys][addition][>strf][,alternate][&replacement][|default])[flags][width][.precision][length]type %(name[.keys][addition][>strf][,alternate][&replacement][|default])[flags][width][.precision][length]type
``` ```
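Two of the conversions described above, date/time formatting with `>` and the `U` Unicode normalization type, rest on standard Python facilities and can be approximated directly. This is a rough sketch, not yt-dlp's implementation: the helper names are hypothetical, and it assumes that `upload_date` is a `YYYYMMDD` string and that `duration` is a number of seconds.

```python
import time
import unicodedata
from datetime import datetime


def format_upload_date(upload_date, fmt):
    """Approximate %(upload_date>FMT)s for a YYYYMMDD string."""
    return datetime.strptime(upload_date, "%Y%m%d").strftime(fmt)


def format_seconds(seconds, fmt):
    """Approximate %(duration>FMT)s by treating the number as seconds since the epoch (UTC)."""
    return time.strftime(fmt, time.gmtime(seconds))


def normalize_title(title, form="NFC", max_length=None):
    """Approximate the U conversion: normalize, then optionally truncate to a precision."""
    normalized = unicodedata.normalize(form, title)
    return normalized if max_length is None else normalized[:max_length]


print(format_upload_date("20121002", "%Y-%m-%d"))
print(format_seconds(3725, "%H-%M-%S"))
print(normalize_title("\ufb01lm", form="NFKC", max_length=100))
```

Here `NFKC` folds the single ligature character U+FB01 into the two letters `fi`, which is the kind of compatibility normalization that `%(title)+.100U` requests.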
Additionally, you can set different output templates for the various metadata files separately from the general output template by specifying the type of file followed by the template separated by a colon `:`. The different file types supported are `subtitle`, `thumbnail`, `description`, `annotation` (deprecated), `infojson`, `link`, `pl_thumbnail`, `pl_description`, `pl_infojson`, `chapter`, `pl_video`. For example, `-o "%(title)s.%(ext)s" -o "thumbnail:%(title)s\%(title)s.%(ext)s"` will put the thumbnails in a folder with the same name as the video. If any of the templates is empty, that type of file will not be written. Eg: `--write-thumbnail -o "thumbnail:"` will write thumbnails only for playlists and not for video. Additionally, you can set different output templates for the various metadata files separately from the general output template by specifying the type of file followed by the template separated by a colon `:`. The different file types supported are `subtitle`, `thumbnail`, `description`, `annotation` (deprecated), `infojson`, `link`, `pl_thumbnail`, `pl_description`, `pl_infojson`, `chapter`, `pl_video`. E.g. `-o "%(title)s.%(ext)s" -o "thumbnail:%(title)s\%(title)s.%(ext)s"` will put the thumbnails in a folder with the same name as the video. If any of the templates is empty, that type of file will not be written. E.g. `--write-thumbnail -o "thumbnail:"` will write thumbnails only for playlists and not for video.
The available fields are: The available fields are:
- `id` (string): Video identifier - `id` (string): Video identifier
- `title` (string): Video title - `title` (string): Video title
- `fulltitle` (string): Video title ignoring live timestamp and generic title - `fulltitle` (string): Video title ignoring live timestamp and generic title
- `url` (string): Video URL
- `ext` (string): Video filename extension - `ext` (string): Video filename extension
- `alt_title` (string): A secondary title of the video - `alt_title` (string): A secondary title of the video
- `description` (string): The description of the video - `description` (string): The description of the video
@ -1250,6 +1228,7 @@ # OUTPUT TEMPLATE
- `duration` (numeric): Length of the video in seconds - `duration` (numeric): Length of the video in seconds
- `duration_string` (string): Length of the video (HH:mm:ss) - `duration_string` (string): Length of the video (HH:mm:ss)
- `view_count` (numeric): How many users have watched the video on the platform - `view_count` (numeric): How many users have watched the video on the platform
- `concurrent_view_count` (numeric): How many users are currently watching the video on the platform
- `like_count` (numeric): Number of positive ratings of the video - `like_count` (numeric): Number of positive ratings of the video
- `dislike_count` (numeric): Number of negative ratings of the video - `dislike_count` (numeric): Number of negative ratings of the video
- `repost_count` (numeric): Number of reposts of the video - `repost_count` (numeric): Number of reposts of the video
@ -1263,25 +1242,6 @@ # OUTPUT TEMPLATE
- `availability` (string): Whether the video is "private", "premium_only", "subscriber_only", "needs_auth", "unlisted" or "public" - `availability` (string): Whether the video is "private", "premium_only", "subscriber_only", "needs_auth", "unlisted" or "public"
- `start_time` (numeric): Time in seconds where the reproduction should start, as specified in the URL - `start_time` (numeric): Time in seconds where the reproduction should start, as specified in the URL
- `end_time` (numeric): Time in seconds where the reproduction should end, as specified in the URL - `end_time` (numeric): Time in seconds where the reproduction should end, as specified in the URL
- `format` (string): A human-readable description of the format
- `format_id` (string): Format code specified by `--format`
- `format_note` (string): Additional info about the format
- `width` (numeric): Width of the video
- `height` (numeric): Height of the video
- `resolution` (string): Textual description of width and height
- `tbr` (numeric): Average bitrate of audio and video in KBit/s
- `abr` (numeric): Average audio bitrate in KBit/s
- `acodec` (string): Name of the audio codec in use
- `asr` (numeric): Audio sampling rate in Hertz
- `vbr` (numeric): Average video bitrate in KBit/s
- `fps` (numeric): Frame rate
- `dynamic_range` (string): The dynamic range of the video
- `stretched_ratio` (float): `width:height` of the video's pixels, if not square
- `vcodec` (string): Name of the video codec in use
- `container` (string): Name of the container format
- `filesize` (numeric): The number of bytes, if known in advance
- `filesize_approx` (numeric): An estimate for the number of bytes
- `protocol` (string): The protocol that will be used for the actual download
- `extractor` (string): Name of the extractor - `extractor` (string): Name of the extractor
- `extractor_key` (string): Key name of the extractor - `extractor_key` (string): Key name of the extractor
- `epoch` (numeric): Unix epoch of when the information extraction was completed - `epoch` (numeric): Unix epoch of when the information extraction was completed
@ -1301,6 +1261,8 @@ # OUTPUT TEMPLATE
- `webpage_url_domain` (string): The domain of the webpage URL - `webpage_url_domain` (string): The domain of the webpage URL
- `original_url` (string): The URL given by the user (or same as `webpage_url` for playlist entries) - `original_url` (string): The URL given by the user (or same as `webpage_url` for playlist entries)
All the fields in [Filtering Formats](#filtering-formats) can also be used
Available for the video that belongs to some logical chapter or section: Available for the video that belongs to some logical chapter or section:
- `chapter` (string): Name or title of the chapter the video belongs to - `chapter` (string): Name or title of the chapter the video belongs to
@ -1351,18 +1313,19 @@ # OUTPUT TEMPLATE
- `start_time` (numeric): Start time of the chapter in seconds - `start_time` (numeric): Start time of the chapter in seconds
- `end_time` (numeric): End time of the chapter in seconds - `end_time` (numeric): End time of the chapter in seconds
- `categories` (list): The SponsorBlock categories the chapter belongs to - `categories` (list): The [SponsorBlock categories](https://wiki.sponsor.ajay.app/w/Types#Category) the chapter belongs to
- `category` (string): The smallest SponsorBlock category the chapter belongs to - `category` (string): The smallest SponsorBlock category the chapter belongs to
- `category_names` (list): Friendly names of the categories - `category_names` (list): Friendly names of the categories
- `name` (string): Friendly name of the smallest category - `name` (string): Friendly name of the smallest category
- `type` (string): The [SponsorBlock action type](https://wiki.sponsor.ajay.app/w/Types#Action_Type) of the chapter
Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. For example for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `yt-dlp test video` and id `BaW_jenozKc`, this will result in a `yt-dlp test video-BaW_jenozKc.mp4` file created in the current directory. Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. E.g. for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `yt-dlp test video` and id `BaW_jenozKc`, this will result in a `yt-dlp test video-BaW_jenozKc.mp4` file created in the current directory.
Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with placeholder value provided with `--output-na-placeholder` (`NA` by default). Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with placeholder value provided with `--output-na-placeholder` (`NA` by default).
**Tip**: Look at the `-j` output to identify which fields are available for the particular URL **Tip**: Look at the `-j` output to identify which fields are available for the particular URL
For numeric sequences you can use [numeric related formatting](https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting), for example, `%(view_count)05d` will result in a string with view count padded with zeros up to 5 characters, like in `00042`. For numeric sequences you can use [numeric related formatting](https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting); e.g. `%(view_count)05d` will result in a string with view count padded with zeros up to 5 characters, like in `00042`.
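Since the sequences follow Python's printf-style formatting, the substitution described above can be tried out with the `%` operator and a mapping. This is a simplified sketch: the `FieldsWithPlaceholder` class is hypothetical and only imitates the `NA` placeholder for missing fields, while the metadata values are the ones used in the examples above.

```python
class FieldsWithPlaceholder(dict):
    """A dict that returns a placeholder string for fields that are not present."""

    def __init__(self, fields, placeholder="NA"):
        super().__init__(fields)
        self.placeholder = placeholder

    def __missing__(self, key):
        return self.placeholder


fields = FieldsWithPlaceholder({
    "title": "yt-dlp test video",
    "id": "BaW_jenozKc",
    "ext": "mp4",
    "view_count": 42,
})

filename = "%(title)s-%(id)s.%(ext)s" % fields
padded = "%(view_count)05d" % fields
missing = "%(uploader)s" % fields

print(filename)
print(padded)
print(missing)
```

Note that this sketch only substitutes the placeholder for string conversions; a numeric conversion such as `05d` applied to a missing field would raise an error in plain Python.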
Output templates can also contain arbitrary hierarchical path, e.g. `-o "%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s"` which will result in downloading each video in a directory corresponding to this path template. Any missing directory will be automatically created for you. Output templates can also contain arbitrary hierarchical path, e.g. `-o "%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s"` which will result in downloading each video in a directory corresponding to this path template. Any missing directory will be automatically created for you.
@ -1372,22 +1335,16 @@ # OUTPUT TEMPLATE
In some cases, you don't want special characters such as 中, spaces, or &, such as when transferring the downloaded filename to a Windows system or the filename through an 8bit-unsafe channel. In these cases, add the `--restrict-filenames` flag to get a shorter title. In some cases, you don't want special characters such as 中, spaces, or &, such as when transferring the downloaded filename to a Windows system or the filename through an 8bit-unsafe channel. In these cases, add the `--restrict-filenames` flag to get a shorter title.
<!-- MANPAGE: BEGIN EXCLUDED SECTION -->
#### Output template and Windows batch files
If you are using an output template inside a Windows batch file then you must escape plain percent characters (`%`) by doubling, so that `-o "%(title)s-%(id)s.%(ext)s"` should become `-o "%%(title)s-%%(id)s.%%(ext)s"`. However you should not touch `%`'s that are not plain characters, e.g. environment variables for expansion should stay intact: `-o "C:\%HOMEPATH%\Desktop\%%(title)s.%%(ext)s"`.
<!-- MANPAGE: END EXCLUDED SECTION -->
#### Output template examples #### Output template examples
```bash ```bash
$ yt-dlp --get-filename -o "test video.%(ext)s" BaW_jenozKc $ yt-dlp --print filename -o "test video.%(ext)s" BaW_jenozKc
test video.webm # Literal name with correct extension test video.webm # Literal name with correct extension
$ yt-dlp --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc $ yt-dlp --print filename -o "%(title)s.%(ext)s" BaW_jenozKc
youtube-dl test video ''_ä↭𝕐.webm # All kinds of weird characters youtube-dl test video ''_ä↭𝕐.webm # All kinds of weird characters
$ yt-dlp --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc --restrict-filenames $ yt-dlp --print filename -o "%(title)s.%(ext)s" BaW_jenozKc --restrict-filenames
youtube-dl_test_video_.webm # Restricted file name youtube-dl_test_video_.webm # Restricted file name
# Download YouTube playlist videos in separate directory indexed by video order in a playlist # Download YouTube playlist videos in separate directory indexed by video order in a playlist
@ -1432,7 +1389,7 @@ # FORMAT SELECTION
**tl;dr:** [navigate me to examples](#format-selection-examples). **tl;dr:** [navigate me to examples](#format-selection-examples).
<!-- MANPAGE: END EXCLUDED SECTION --> <!-- MANPAGE: END EXCLUDED SECTION -->
The simplest case is requesting a specific format, for example with `-f 22` you can download the format with format code equal to 22. You can get the list of available format codes for particular video using `--list-formats` or `-F`. Note that these format codes are extractor specific. The simplest case is requesting a specific format; e.g. with `-f 22` you can download the format with format code equal to 22. You can get the list of available format codes for particular video using `--list-formats` or `-F`. Note that these format codes are extractor specific.
You can also use a file extension (currently `3gp`, `aac`, `flv`, `m4a`, `mp3`, `mp4`, `ogg`, `wav`, `webm` are supported) to download the best quality format of a particular file extension served as a single file, e.g. `-f webm` will download the best quality format with the `webm` extension served as a single file. You can also use a file extension (currently `3gp`, `aac`, `flv`, `m4a`, `mp3`, `mp4`, `ogg`, `wav`, `webm` are supported) to download the best quality format of a particular file extension served as a single file, e.g. `-f webm` will download the best quality format with the `webm` extension served as a single file.
@ -1459,15 +1416,15 @@ # FORMAT SELECTION
You can select the n'th best format of a type by using `best<type>.<n>`. For example, `best.2` will select the 2nd best combined format. Similarly, `bv*.3` will select the 3rd best format that contains a video stream. You can select the n'th best format of a type by using `best<type>.<n>`. For example, `best.2` will select the 2nd best combined format. Similarly, `bv*.3` will select the 3rd best format that contains a video stream.
If you want to download multiple videos and they don't have the same formats available, you can specify the order of preference using slashes. Note that formats on the left hand side are preferred, for example `-f 22/17/18` will download format 22 if it's available, otherwise it will download format 17 if it's available, otherwise it will download format 18 if it's available, otherwise it will complain that no suitable formats are available for download. If you want to download multiple videos, and they don't have the same formats available, you can specify the order of preference using slashes. Note that formats on the left hand side are preferred; e.g. `-f 22/17/18` will download format 22 if it's available, otherwise it will download format 17 if it's available, otherwise it will download format 18 if it's available, otherwise it will complain that no suitable formats are available for download.
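The left-to-right preference expressed by slashes amounts to picking the first alternative that is available. A minimal sketch of that rule, assuming plain format codes only (no filters, merges or commas); `pick_format` is a hypothetical helper, not part of yt-dlp:

```python
def pick_format(preference, available):
    """Return the first format code in a slash-separated preference that is available."""
    for format_id in preference.split("/"):
        if format_id in available:
            return format_id
    return None  # no suitable format is available


print(pick_format("22/17/18", {"17", "18"}))
print(pick_format("22/17/18", {"140"}))
```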
If you want to download several formats of the same video use a comma as a separator, e.g. `-f 22,17,18` will download all these three formats, of course if they are available. Or a more sophisticated example combined with the precedence feature: `-f 136/137/mp4/bestvideo,140/m4a/bestaudio`. If you want to download several formats of the same video use a comma as a separator, e.g. `-f 22,17,18` will download all these three formats, of course if they are available. Or a more sophisticated example combined with the precedence feature: `-f 136/137/mp4/bestvideo,140/m4a/bestaudio`.
You can merge the video and audio of multiple formats into a single file using `-f <format1>+<format2>+...` (requires ffmpeg installed), for example `-f bestvideo+bestaudio` will download the best video-only format, the best audio-only format and mux them together with ffmpeg. You can merge the video and audio of multiple formats into a single file using `-f <format1>+<format2>+...` (requires ffmpeg installed); e.g. `-f bestvideo+bestaudio` will download the best video-only format, the best audio-only format and mux them together with ffmpeg.
**Deprecation warning**: Since the behavior described *below* is complex and counter-intuitive, it will be removed and multistreams will be enabled by default in the future. A new operator will instead be added to limit formats to a single audio/video stream. **Deprecation warning**: Since the behavior described *below* is complex and counter-intuitive, it will be removed and multistreams will be enabled by default in the future. A new operator will instead be added to limit formats to a single audio/video stream.
Unless `--video-multistreams` is used, all formats with a video stream except the first one are ignored. Similarly, unless `--audio-multistreams` is used, all formats with an audio stream except the first one are ignored. For example, `-f bestvideo+best+bestaudio --video-multistreams --audio-multistreams` will download and merge all 3 given formats. The resulting file will have 2 video streams and 2 audio streams. But `-f bestvideo+best+bestaudio --no-video-multistreams` will download and merge only `bestvideo` and `bestaudio`. `best` is ignored since another format containing a video stream (`bestvideo`) has already been selected. The order of the formats is therefore important. `-f best+bestaudio --no-audio-multistreams` will download and merge both formats while `-f bestaudio+best --no-audio-multistreams` will ignore `best` and download only `bestaudio`. Unless `--video-multistreams` is used, all formats with a video stream except the first one are ignored. Similarly, unless `--audio-multistreams` is used, all formats with an audio stream except the first one are ignored. E.g. `-f bestvideo+best+bestaudio --video-multistreams --audio-multistreams` will download and merge all 3 given formats. The resulting file will have 2 video streams and 2 audio streams. But `-f bestvideo+best+bestaudio --no-video-multistreams` will download and merge only `bestvideo` and `bestaudio`. `best` is ignored since another format containing a video stream (`bestvideo`) has already been selected. The order of the formats is therefore important. `-f best+bestaudio --no-audio-multistreams` will download only `best` while `-f bestaudio+best --no-audio-multistreams` will ignore `best` and download only `bestaudio`.
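The ordering rule described above can be sketched as a small filter over the formats joined with `+`. The function name and the `(name, has_video, has_audio)` tuples are assumptions made for this illustration; yt-dlp's own data structures differ.

```python
def apply_multistream_rules(formats, video_multistreams=False, audio_multistreams=False):
    """Return the names of the formats that would be kept for merging.

    `formats` is a list of (name, has_video, has_audio) tuples in the
    order they were written in the selector. Illustrative model only.
    """
    kept = []
    seen_video = seen_audio = False
    for name, has_video, has_audio in formats:
        # A later format carrying video is ignored unless video multistreams are allowed
        if has_video and seen_video and not video_multistreams:
            continue
        # The same rule applies to formats carrying audio
        if has_audio and seen_audio and not audio_multistreams:
            continue
        kept.append(name)
        seen_video = seen_video or has_video
        seen_audio = seen_audio or has_audio
    return kept


BESTVIDEO = ('bestvideo', True, False)
BEST = ('best', True, True)
BESTAUDIO = ('bestaudio', False, True)

print(apply_multistream_rules([BESTVIDEO, BEST, BESTAUDIO]))
print(apply_multistream_rules([BEST, BESTAUDIO]))
print(apply_multistream_rules([BESTAUDIO, BEST]))
```

In this model `best+bestaudio` keeps only `best`, because `best` already carries an audio stream, which matches the corrected wording in the paragraph above.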
## Filtering Formats ## Filtering Formats
@ -1476,6 +1433,7 @@ ## Filtering Formats
The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals): The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals):
- `filesize`: The number of bytes, if known in advance - `filesize`: The number of bytes, if known in advance
- `filesize_approx`: An estimate for the number of bytes
- `width`: Width of the video, if known - `width`: Width of the video, if known
- `height`: Height of the video, if known - `height`: Height of the video, if known
- `tbr`: Average bitrate of audio and video in KBit/s - `tbr`: Average bitrate of audio and video in KBit/s
@ -1483,24 +1441,31 @@ ## Filtering Formats
- `vbr`: Average video bitrate in KBit/s - `vbr`: Average video bitrate in KBit/s
- `asr`: Audio sampling rate in Hertz - `asr`: Audio sampling rate in Hertz
- `fps`: Frame rate - `fps`: Frame rate
- `audio_channels`: The number of audio channels
- `stretched_ratio`: `width:height` of the video's pixels, if not square
Filtering also works with the comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains), `~=` (matches regex) on the following string meta fields: Filtering also works with the comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains), `~=` (matches regex) on the following string meta fields:
- `url`: Video URL
- `ext`: File extension - `ext`: File extension
- `acodec`: Name of the audio codec in use - `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use - `vcodec`: Name of the video codec in use
- `container`: Name of the container format - `container`: Name of the container format
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`) - `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`)
- `format_id`: A short description of the format
- `language`: Language code - `language`: Language code
- `dynamic_range`: The dynamic range of the video
- `format_id`: A short description of the format
- `format`: A human-readable description of the format
- `format_note`: Additional info about the format
- `resolution`: Textual description of width and height
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain). The comparand of a string comparison needs to be quoted with either double or single quotes if it contains spaces or special characters other than `._-`. Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain). The comparand of a string comparison needs to be quoted with either double or single quotes if it contains spaces or special characters other than `._-`.
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by the particular extractor, i.e. the metadata offered by the website. Any other field made available by the extractor can also be used for filtering. Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by the particular extractor, i.e. the metadata offered by the website. Any other field made available by the extractor can also be used for filtering.
Formats for which the value is not known are excluded unless you put a question mark (`?`) after the operator. You can combine format filters, so `-f "[height<=?720][tbr>500]"` selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s. You can also use the filters with `all` to download all formats that satisfy the filter. For example, `-f "all[vcodec=none]"` selects all audio-only formats. Formats for which the value is not known are excluded unless you put a question mark (`?`) after the operator. You can combine format filters, so `-f "[height<=?720][tbr>500]"` selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s. You can also use the filters with `all` to download all formats that satisfy the filter, e.g. `-f "all[vcodec=none]"` selects all audio-only formats.
Format selectors can also be grouped using parentheses, for example if you want to download the best pre-merged mp4 and webm formats with a height lower than 480 you can use `-f "(mp4,webm)[height<480]"`. Format selectors can also be grouped using parentheses; e.g. `-f "(mp4,webm)[height<480]"` will download the best pre-merged mp4 and webm formats with a height lower than 480.
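To make the role of the question mark concrete, the following sketch models numeric filters over format dictionaries. The helper names (`passes`, `select`) and the sample data are hypothetical; only the field names `height` and `tbr` and the selector `[height<=?720][tbr>500]` come from the text above.

```python
import operator

COMPARISONS = {
    '<': operator.lt, '<=': operator.le,
    '>': operator.gt, '>=': operator.ge,
    '=': operator.eq, '!=': operator.ne,
}


def passes(fmt, field, op, value, unknown_ok=False):
    """Evaluate one numeric filter; `unknown_ok` plays the role of the `?` suffix."""
    actual = fmt.get(field)
    if actual is None:
        # Formats with an unknown value are excluded unless `?` was given
        return unknown_ok
    return COMPARISONS[op](actual, value)


def select(formats):
    """Rough equivalent of "[height<=?720][tbr>500]": both filters must hold."""
    return [
        f for f in formats
        if passes(f, 'height', '<=', 720, unknown_ok=True) and passes(f, 'tbr', '>', 500)
    ]


formats = [
    {'format_id': 'a', 'height': 1080, 'tbr': 2000},
    {'format_id': 'b', 'height': 720, 'tbr': 900},
    {'format_id': 'c', 'tbr': 600},                  # height unknown, allowed by `?`
    {'format_id': 'd', 'height': 480, 'tbr': 300},
    {'format_id': 'e', 'height': 360},               # tbr unknown, no `?`, so excluded
]
print([f['format_id'] for f in select(formats)])
```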
## Sorting Formats ## Sorting Formats
@ -1508,8 +1473,8 @@ ## Sorting Formats
The available fields are: The available fields are:
- `hasvid`: Gives priority to formats that has a video stream - `hasvid`: Gives priority to formats that have a video stream
- `hasaud`: Gives priority to formats that has a audio stream - `hasaud`: Gives priority to formats that have an audio stream
- `ie_pref`: The format preference - `ie_pref`: The format preference
- `lang`: The language preference - `lang`: The language preference
- `quality`: The quality of the format - `quality`: The quality of the format
@ -1519,7 +1484,7 @@ ## Sorting Formats
- `acodec`: Audio Codec (`flac`/`alac` > `wav`/`aiff` > `opus` > `vorbis` > `aac` > `mp4a` > `mp3` > `eac3` > `ac3` > `dts` > other) - `acodec`: Audio Codec (`flac`/`alac` > `wav`/`aiff` > `opus` > `vorbis` > `aac` > `mp4a` > `mp3` > `eac3` > `ac3` > `dts` > other)
- `codec`: Equivalent to `vcodec,acodec` - `codec`: Equivalent to `vcodec,acodec`
- `vext`: Video Extension (`mp4` > `webm` > `flv` > other). If `--prefer-free-formats` is used, `webm` is preferred. - `vext`: Video Extension (`mp4` > `webm` > `flv` > other). If `--prefer-free-formats` is used, `webm` is preferred.
- `aext`: Audio Extension (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other). If `--prefer-free-formats` is used, the order changes to `opus` > `ogg` > `webm` > `m4a` > `mp3` > `aac`. - `aext`: Audio Extension (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other). If `--prefer-free-formats` is used, the order changes to `ogg` > `opus` > `webm` > `mp3` > `m4a` > `aac`
- `ext`: Equivalent to `vext,aext` - `ext`: Equivalent to `vext,aext`
- `filesize`: Exact filesize, if known in advance - `filesize`: Exact filesize, if known in advance
- `fs_approx`: Approximate filesize calculated from the manifests - `fs_approx`: Approximate filesize calculated from the manifests
@ -1529,6 +1494,7 @@ ## Sorting Formats
- `res`: Video resolution, calculated as the smallest dimension. - `res`: Video resolution, calculated as the smallest dimension.
- `fps`: Framerate of video - `fps`: Framerate of video
- `hdr`: The dynamic range of the video (`DV` > `HDR12` > `HDR10+` > `HDR10` > `HLG` > `SDR`) - `hdr`: The dynamic range of the video (`DV` > `HDR12` > `HDR10+` > `HDR10` > `HLG` > `SDR`)
- `channels`: The number of audio channels
- `tbr`: Total average bitrate in KBit/s - `tbr`: Total average bitrate in KBit/s
- `vbr`: Average video bitrate in KBit/s - `vbr`: Average video bitrate in KBit/s
- `abr`: Average audio bitrate in KBit/s - `abr`: Average audio bitrate in KBit/s
@ -1537,11 +1503,11 @@ ## Sorting Formats
**Deprecation warning**: Many of these fields have (currently undocumented) aliases, that may be removed in a future version. It is recommended to use only the documented field names. **Deprecation warning**: Many of these fields have (currently undocumented) aliases, that may be removed in a future version. It is recommended to use only the documented field names.
All fields, unless specified otherwise, are sorted in descending order. To reverse this, prefix the field with a `+`. Eg: `+res` prefers format with the smallest resolution. Additionally, you can suffix a preferred value for the fields, separated by a `:`. Eg: `res:720` prefers larger videos, but no larger than 720p and the smallest video if there are no videos less than 720p. For `codec` and `ext`, you can provide two preferred values, the first for video and the second for audio. Eg: `+codec:avc:m4a` (equivalent to `+vcodec:avc,+acodec:m4a`) sets the video codec preference to `h264` > `h265` > `vp9` > `vp9.2` > `av01` > `vp8` > `h263` > `theora` and audio codec preference to `mp4a` > `aac` > `vorbis` > `opus` > `mp3` > `ac3` > `dts`. You can also make the sorting prefer the nearest values to the provided by using `~` as the delimiter. Eg: `filesize~1G` prefers the format with filesize closest to 1 GiB. All fields, unless specified otherwise, are sorted in descending order. To reverse this, prefix the field with a `+`. E.g. `+res` prefers format with the smallest resolution. Additionally, you can suffix a preferred value for the fields, separated by a `:`. E.g. `res:720` prefers larger videos, but no larger than 720p and the smallest video if there are no videos less than 720p. For `codec` and `ext`, you can provide two preferred values, the first for video and the second for audio. E.g. `+codec:avc:m4a` (equivalent to `+vcodec:avc,+acodec:m4a`) sets the video codec preference to `h264` > `h265` > `vp9` > `vp9.2` > `av01` > `vp8` > `h263` > `theora` and audio codec preference to `mp4a` > `aac` > `vorbis` > `opus` > `mp3` > `ac3` > `dts`. You can also make the sorting prefer the nearest values to the provided by using `~` as the delimiter. E.g. `filesize~1G` prefers the format with filesize closest to 1 GiB.
The fields `hasvid` and `ie_pref` are always given highest priority in sorting, irrespective of the user-defined order. This behaviour can be changed by using `--format-sort-force`. Apart from these, the default order used is: `lang,quality,res,fps,hdr:12,codec:vp9.2,size,br,asr,proto,ext,hasaud,source,id`. The extractors may override this default order, but they cannot override the user-provided order. The fields `hasvid` and `ie_pref` are always given highest priority in sorting, irrespective of the user-defined order. This behaviour can be changed by using `--format-sort-force`. Apart from these, the default order used is: `lang,quality,res,fps,hdr:12,vcodec:vp9.2,channels,acodec,size,br,asr,proto,ext,hasaud,source,id`. The extractors may override this default order, but they cannot override the user-provided order.
Note that the default has `codec:vp9.2`; i.e. `av1` is not preferred. Similarly, the default for hdr is `hdr:12`; i.e. dolby vision is not preferred. These choices are made since DV and AV1 formats are not yet fully compatible with most devices. This may be changed in the future as more devices become capable of smoothly playing back these formats. Note that the default has `vcodec:vp9.2`; i.e. `av1` is not preferred. Similarly, the default for hdr is `hdr:12`; i.e. dolby vision is not preferred. These choices are made since DV and AV1 formats are not yet fully compatible with most devices. This may be changed in the future as more devices become capable of smoothly playing back these formats.
If your format selector is `worst`, the last item is selected after sorting. This means it will select the format that is worst in all respects. Most of the time, what you actually want is the video with the smallest filesize instead. So it is generally better to use `-f best -S +size,+br,+res,+fps`. If your format selector is `worst`, the last item is selected after sorting. This means it will select the format that is worst in all respects. Most of the time, what you actually want is the video with the smallest filesize instead. So it is generally better to use `-f best -S +size,+br,+res,+fps`.
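The effect of a preferred value such as `res:720` can be approximated with an ordinary sort key. This is a simplified model, not yt-dlp's sorting code: `res_sort_key` is a hypothetical name, each format is reduced to a single number, and values equal to the limit are treated as within it.

```python
def res_sort_key(height, preferred):
    """Key for sorted(): smaller keys are preferred.

    Within the limit, larger values win; above the limit, smaller values win.
    """
    if height <= preferred:
        return (0, -height)
    return (1, height)


heights = [1080, 480, 720, 1440]
print(sorted(heights, key=lambda h: res_sort_key(h, 720)))
print(sorted([1440, 1080], key=lambda h: res_sort_key(h, 720)))
```

When nothing fits under the limit, the smallest remaining value sorts first, which mirrors the "smallest video if there are no videos less than 720p" rule described above.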
@ -1678,13 +1644,13 @@ # MODIFYING METADATA
The general syntax of `--parse-metadata FROM:TO` is to give the name of a field or an [output template](#output-template) to extract data from, and the format to interpret it as, separated by a colon `:`. Either a [python regular expression](https://docs.python.org/3/library/re.html#regular-expression-syntax) with named capture groups or a similar syntax to the [output template](#output-template) (only `%(field)s` formatting is supported) can be used for `TO`. The option can be used multiple times to parse and modify various fields. The general syntax of `--parse-metadata FROM:TO` is to give the name of a field or an [output template](#output-template) to extract data from, and the format to interpret it as, separated by a colon `:`. Either a [python regular expression](https://docs.python.org/3/library/re.html#regular-expression-syntax) with named capture groups or a similar syntax to the [output template](#output-template) (only `%(field)s` formatting is supported) can be used for `TO`. The option can be used multiple times to parse and modify various fields.
Note that any field created by this can be used in the [output template](#output-template) and will also affect the media file's metadata added when using `--add-metadata`. Note that any field created by this can be used in the [output template](#output-template) and will also affect the media file's metadata added when using `--embed-metadata`.
This option also has a few special uses: This option also has a few special uses:
* You can download an additional URL based on the metadata of the currently downloaded video. To do this, set the field `additional_urls` to the URL that you want to download. Eg: `--parse-metadata "description:(?P<additional_urls>https?://www\.vimeo\.com/\d+)"` will download the first vimeo video found in the description * You can download an additional URL based on the metadata of the currently downloaded video. To do this, set the field `additional_urls` to the URL that you want to download. E.g. `--parse-metadata "description:(?P<additional_urls>https?://www\.vimeo\.com/\d+)"` will download the first vimeo video found in the description
* You can use this to change the metadata that is embedded in the media file. To do this, set the value of the corresponding field with a `meta_` prefix. For example, any value you set to `meta_description` field will be added to the `description` field in the file. For example, you can use this to set a different "description" and "synopsis". To modify the metadata of individual streams, use the `meta<n>_` prefix (Eg: `meta1_language`). Any value set to the `meta_` field will overwrite all default values. * You can use this to change the metadata that is embedded in the media file. To do this, set the value of the corresponding field with a `meta_` prefix. For example, any value you set to `meta_description` field will be added to the `description` field in the file - you can use this to set a different "description" and "synopsis". To modify the metadata of individual streams, use the `meta<n>_` prefix (e.g. `meta1_language`). Any value set to the `meta_` field will overwrite all default values.
**Note**: Metadata modification happens before format selection, post-extraction and other post-processing operations. Some fields may be added or changed during these steps, overriding your changes. **Note**: Metadata modification happens before format selection, post-extraction and other post-processing operations. Some fields may be added or changed during these steps, overriding your changes.
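The regular-expression form of `TO` relies on Python named capture groups: each group name becomes the field that receives the captured text. The sketch below shows that mechanism with the standard `re` module. `parse_metadata` is a hypothetical helper, the sample description is made up, and using `re.search` is an assumption about how the pattern is applied.

```python
import re


def parse_metadata(info, field, pattern):
    """Return a copy of `info` with every named group from `pattern` added as a field."""
    updated = dict(info)
    match = re.search(pattern, info.get(field, ''))
    if match:
        # groupdict() maps each (?P<name>...) group to the text it captured
        updated.update(match.groupdict())
    return updated


info = {'description': 'Mirror: https://www.vimeo.com/12345 (HD)'}
result = parse_metadata(
    info, 'description', r'(?P<additional_urls>https?://www\.vimeo\.com/\d+)')
print(result['additional_urls'])
```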
@ -1724,11 +1690,11 @@ # Set title as "Series name S01E05"
$ yt-dlp --parse-metadata "%(series)s S%(season_number)02dE%(episode_number)02d:%(title)s" $ yt-dlp --parse-metadata "%(series)s S%(season_number)02dE%(episode_number)02d:%(title)s"
# Prioritize uploader as the "artist" field in video metadata # Prioritize uploader as the "artist" field in video metadata
$ yt-dlp --parse-metadata "%(uploader|)s:%(meta_artist)s" --add-metadata $ yt-dlp --parse-metadata "%(uploader|)s:%(meta_artist)s" --embed-metadata
# Set "comment" field in video metadata using description instead of webpage_url, # Set "comment" field in video metadata using description instead of webpage_url,
# handling multiple lines correctly # handling multiple lines correctly
$ yt-dlp --parse-metadata "description:(?s)(?P<meta_comment>.+)" --add-metadata $ yt-dlp --parse-metadata "description:(?s)(?P<meta_comment>.+)" --embed-metadata
# Do not set any "synopsis" in the video metadata # Do not set any "synopsis" in the video metadata
$ yt-dlp --parse-metadata ":(?P<meta_synopsis>)" $ yt-dlp --parse-metadata ":(?P<meta_synopsis>)"
@ -1743,39 +1709,37 @@ # Replace all spaces and "_" in title and uploader with a `-`
# EXTRACTOR ARGUMENTS # EXTRACTOR ARGUMENTS
Some extractors accept additional arguments which can be passed using `--extractor-args KEY:ARGS`. `ARGS` is a `;` (semicolon) separated string of `ARG=VAL1,VAL2`. Eg: `--extractor-args "youtube:player-client=android_embedded,web;include_live_dash" --extractor-args "funimation:version=uncut"` Some extractors accept additional arguments which can be passed using `--extractor-args KEY:ARGS`. `ARGS` is a `;` (semicolon) separated string of `ARG=VAL1,VAL2`. E.g. `--extractor-args "youtube:player-client=android_embedded,web;include_live_dash" --extractor-args "funimation:version=uncut"`
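The `KEY:ARGS` syntax can be read as two nested splits. The parser below is a minimal sketch of that reading, not yt-dlp's actual option handling: `parse_extractor_args` is a hypothetical name, and representing a value-less argument such as `include_live_dash` as an empty list is an assumption.

```python
def parse_extractor_args(value):
    """Split 'KEY:ARG=VAL1,VAL2;ARG2' into (key, {arg: [values]})."""
    key, _, args = value.partition(':')
    parsed = {}
    for item in args.split(';'):
        if not item:
            continue  # tolerate empty segments such as a trailing ';'
        name, has_value, values = item.partition('=')
        # Assumption: an argument given without '=' is stored with no values
        parsed[name] = values.split(',') if has_value else []
    return key, parsed


print(parse_extractor_args('youtube:player-client=android_embedded,web;include_live_dash'))
print(parse_extractor_args('funimation:version=uncut'))
```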
The following extractors use this feature: The following extractors use this feature:
#### youtube #### youtube
* `lang`: Language code to prefer translated metadata of this language (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively * `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
* `player_client`: Clients to extract video data from. The main clients are `web`, `android` and `ios` with variants `_music`, `_embedded`, `_embedscreen`, `_creator` (Eg: `web_embedded`); and `mweb` and `tv_embedded` (agegate bypass) with no variants. By default, `android,web` is used, but `tv_embedded` and `creator` variants are added as required for age-gated videos. Similarly the music variants are added for `music.youtube.com` urls. You can use `all` to use all the clients, and `default` for the default clients. * `player_client`: Clients to extract video data from. The main clients are `web`, `android` and `ios` with variants `_music`, `_embedded`, `_embedscreen`, `_creator` (e.g. `web_embedded`); and `mweb` and `tv_embedded` (agegate bypass) with no variants. By default, `android,web` is used, but `tv_embedded` and `creator` variants are added as required for age-gated videos. Similarly, the music variants are added for `music.youtube.com` urls. You can use `all` to use all the clients, and `default` for the default clients.
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details * `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details
* `include_live_dash`: Include live dash formats even without `--live-from-start` (These formats don't download properly)
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side) * `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
* `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all` * `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all`
* E.g. `all,all,1000,10` will get a maximum of 1000 replies total, with up to 10 replies per thread. `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total * E.g. `all,all,1000,10` will get a maximum of 1000 replies total, with up to 10 replies per thread. `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total
* `innertube_host`: Innertube API host to use for all API requests * `include_incomplete_formats`: Extract formats that cannot be downloaded completely (live dash and post-live m3u8)
* e.g. `studio.youtube.com`, `youtubei.googleapis.com` * `innertube_host`: Innertube API host to use for all API requests; e.g. `studio.youtube.com`, `youtubei.googleapis.com`. Note that cookies exported from one subdomain will not work on others
* Note: Cookies exported from `www.youtube.com` will not work with hosts other than `*.youtube.com`
* `innertube_key`: Innertube API key to use for all API requests * `innertube_key`: Innertube API key to use for all API requests
#### youtubetab (YouTube playlists, channels, feeds, etc.) #### youtubetab (YouTube playlists, channels, feeds, etc.)
* `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details) * `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details)
* `approximate_date`: Extract approximate `upload_date` in flat-playlist. This may cause date-based filters to be slightly off * `approximate_date`: Extract approximate `upload_date` and `timestamp` in flat-playlist. This may cause date-based filters to be slightly off
#### funimation #### funimation
* `language`: Languages to extract. Eg: `funimation:language=english,japanese` * `language`: Audio languages to extract, e.g. `funimation:language=english,japanese`
* `version`: The video version to extract - `uncut` or `simulcast` * `version`: The video version to extract - `uncut` or `simulcast`
#### crunchyroll #### crunchyroll
* `language`: Languages to extract. Eg: `crunchyroll:language=jaJp` * `language`: Audio languages to extract, e.g. `crunchyroll:language=jaJp`
* `hardsub`: Which hard-sub versions to extract. Eg: `crunchyroll:hardsub=None,enUS` * `hardsub`: Which hard-sub versions to extract, e.g. `crunchyroll:hardsub=None,enUS`
#### crunchyrollbeta #### crunchyrollbeta
* `format`: Which stream type(s) to extract. Default is `adaptive_hls` Eg: `crunchyrollbeta:format=vo_adaptive_hls` * `format`: Which stream type(s) to extract (default: `adaptive_hls`). Potentially useful values include `adaptive_hls`, `adaptive_dash`, `vo_adaptive_hls`, `vo_adaptive_dash`, `download_hls`, `download_dash`, `multitrack_adaptive_hls_v2`
* Potentially useful values include `adaptive_hls`, `adaptive_dash`, `vo_adaptive_hls`, `vo_adaptive_dash`, `download_hls`, `download_dash`, `multitrack_adaptive_hls_v2` * `hardsub`: Preference order for which hardsub versions to extract, or `all` (default: `None` = no hardsubs), e.g. `crunchyrollbeta:hardsub=en-US,None`
* `hardsub`: Preference order for which hardsub versions to extract. Default is `None` (no hardsubs). Eg: `crunchyrollbeta:hardsub=en-US,None`
#### vikichannel #### vikichannel
* `video_types`: Types of videos to download - one or more of `episodes`, `movies`, `clips`, `trailers` * `video_types`: Types of videos to download - one or more of `episodes`, `movies`, `clips`, `trailers`
@ -1795,11 +1759,11 @@ #### hotstar
* `dr`: dynamic range to ignore - one or more of `sdr`, `hdr10`, `dv` * `dr`: dynamic range to ignore - one or more of `sdr`, `hdr10`, `dv`
#### tiktok #### tiktok
* `app_version`: App version to call mobile APIs with - should be set along with `manifest_app_version`. (e.g. `20.2.1`) * `app_version`: App version to call mobile APIs with - should be set along with `manifest_app_version`, e.g. `20.2.1`
* `manifest_app_version`: Numeric app version to call mobile APIs with. (e.g. `221`) * `manifest_app_version`: Numeric app version to call mobile APIs with, e.g. `221`
#### rokfinchannel #### rokfinchannel
* `tab`: Which tab to download. One of `new`, `top`, `videos`, `podcasts`, `streams`, `stacks`. (E.g. `rokfinchannel:tab=streams`) * `tab`: Which tab to download - one of `new`, `top`, `videos`, `podcasts`, `streams`, `stacks`
NOTE: These options may be changed/removed in the future without concern for backward compatibility NOTE: These options may be changed/removed in the future without concern for backward compatibility
@ -1819,6 +1783,8 @@ # PLUGINS
If you are a plugin author, add [ytdlp-plugins](https://github.com/topics/ytdlp-plugins) as a topic to your repository for discoverability If you are a plugin author, add [ytdlp-plugins](https://github.com/topics/ytdlp-plugins) as a topic to your repository for discoverability
See the [wiki for some known plugins](https://github.com/yt-dlp/yt-dlp/wiki/Plugins)
# EMBEDDING YT-DLP # EMBEDDING YT-DLP
@ -2058,12 +2024,13 @@ #### Redundant options
#### Not recommended #### Not recommended
While these options still work, their use is not recommended since there are other alternatives to achieve the same While these options still work, their use is not recommended since there are other alternatives to achieve the same
--force-generic-extractor --ies generic,default
--exec-before-download CMD --exec "before_dl:CMD" --exec-before-download CMD --exec "before_dl:CMD"
--no-exec-before-download --no-exec --no-exec-before-download --no-exec
--all-formats -f all --all-formats -f all
--all-subs --sub-langs all --write-subs --all-subs --sub-langs all --write-subs
--print-json -j --no-simulate --print-json -j --no-simulate
--autonumber-size NUMBER Use string formatting. Eg: %(autonumber)03d --autonumber-size NUMBER Use string formatting, e.g. %(autonumber)03d
--autonumber-start NUMBER Use internal field formatting like %(autonumber+NUMBER)s --autonumber-start NUMBER Use internal field formatting like %(autonumber+NUMBER)s
--id -o "%(id)s.%(ext)s" --id -o "%(id)s.%(ext)s"
--metadata-from-title FORMAT --parse-metadata "%(title)s:FORMAT" --metadata-from-title FORMAT --parse-metadata "%(title)s:FORMAT"
@ -2142,5 +2109,5 @@ #### Removed
# CONTRIBUTING # CONTRIBUTING
See [CONTRIBUTING.md](CONTRIBUTING.md#contributing-to-yt-dlp) for instructions on [Opening an Issue](CONTRIBUTING.md#opening-an-issue) and [Contributing code to the project](CONTRIBUTING.md#developer-instructions) See [CONTRIBUTING.md](CONTRIBUTING.md#contributing-to-yt-dlp) for instructions on [Opening an Issue](CONTRIBUTING.md#opening-an-issue) and [Contributing code to the project](CONTRIBUTING.md#developer-instructions)
# MORE # WIKI
For FAQ see the [youtube-dl README](https://github.com/ytdl-org/youtube-dl#faq) See the [Wiki](https://github.com/yt-dlp/yt-dlp/wiki) for more information

devscripts/__init__.py Normal file

@ -0,0 +1 @@
# Empty file needed to make devscripts.utils properly importable from outside


@ -11,13 +11,16 @@
# These bloat the lazy_extractors, so allow them to passthrough silently # These bloat the lazy_extractors, so allow them to passthrough silently
ALLOWED_CLASSMETHODS = {'get_testcases', 'extract_from_webpage'} ALLOWED_CLASSMETHODS = {'get_testcases', 'extract_from_webpage'}
_WARNED = False
class LazyLoadMetaClass(type): class LazyLoadMetaClass(type):
def __getattr__(cls, name): def __getattr__(cls, name):
if '_real_class' not in cls.__dict__ and name not in ALLOWED_CLASSMETHODS: global _WARNED
write_string( if ('_real_class' not in cls.__dict__
'WARNING: Falling back to normal extractor since lazy extractor ' and name not in ALLOWED_CLASSMETHODS and not _WARNED):
_WARNED = True
write_string('WARNING: Falling back to normal extractor since lazy extractor '
f'{cls.__name__} does not have attribute {name}{bug_reports_message()}\n') f'{cls.__name__} does not have attribute {name}{bug_reports_message()}\n')
return getattr(cls.real_class, name) return getattr(cls.real_class, name)


@ -7,20 +7,14 @@
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import optparse
import re import re
from devscripts.utils import (
def read(fname): get_filename_args,
with open(fname, encoding='utf-8') as f: read_file,
return f.read() read_version,
write_file,
)
# Get the version without importing the package
def read_version(fname):
exec(compile(read(fname), fname, 'exec'))
return locals()['__version__']
VERBOSE_TMPL = ''' VERBOSE_TMPL = '''
- type: checkboxes - type: checkboxes
@ -58,20 +52,24 @@ def read_version(fname):
required: true required: true
'''.strip() '''.strip()
NO_SKIP = '''
- type: checkboxes
attributes:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\\* field
required: true
'''.strip()
def main(): def main():
parser = optparse.OptionParser(usage='%prog INFILE OUTFILE') fields = {'version': read_version(), 'no_skip': NO_SKIP}
_, args = parser.parse_args()
if len(args) != 2:
parser.error('Expected an input and an output filename')
fields = {'version': read_version('yt_dlp/version.py')}
fields['verbose'] = VERBOSE_TMPL % fields fields['verbose'] = VERBOSE_TMPL % fields
fields['verbose_optional'] = re.sub(r'(\n\s+validations:)?\n\s+required: true', '', fields['verbose']) fields['verbose_optional'] = re.sub(r'(\n\s+validations:)?\n\s+required: true', '', fields['verbose'])
infile, outfile = args infile, outfile = get_filename_args(has_infile=True)
with open(outfile, 'w', encoding='utf-8') as outf: write_file(outfile, read_file(infile) % fields)
outf.write(read(infile) % fields)
if __name__ == '__main__': if __name__ == '__main__':
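The template script fills `%(name)s` placeholders from a `fields` dict and derives an optional variant of the verbose block by stripping the `required: true` validation. A rough sketch of those two steps; the YAML template text here is a shortened, hypothetical stand-in for the real `VERBOSE_TMPL`:

```python
import re

# Hypothetical, shortened template; the real VERBOSE_TMPL is longer
VERBOSE_TMPL = '''
  - type: textarea
    id: log
    attributes:
      label: Verbose log for %(version)s
    validations:
      required: true
'''.strip()

fields = {'version': '2022.10.04'}

# %-formatting with a dict substitutes each %(key)s placeholder
fields['verbose'] = VERBOSE_TMPL % fields

# Same regex as in the diff: drop "required: true" and an optional preceding "validations:" line
fields['verbose_optional'] = re.sub(
    r'(\n\s+validations:)?\n\s+required: true', '', fields['verbose'])

print(fields['verbose_optional'])
```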


@ -2,16 +2,20 @@
# Allow direct execution # Allow direct execution
import os import os
import shutil
import sys import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import optparse
from inspect import getsource from inspect import getsource
from devscripts.utils import get_filename_args, read_file, write_file
NO_ATTR = object() NO_ATTR = object()
STATIC_CLASS_PROPERTIES = ['IE_NAME', 'IE_DESC', 'SEARCH_KEY', '_VALID_URL', '_WORKING', '_NETRC_MACHINE', 'age_limit'] STATIC_CLASS_PROPERTIES = [
'IE_NAME', 'IE_DESC', 'SEARCH_KEY', '_VALID_URL', '_WORKING', '_ENABLED', '_NETRC_MACHINE', 'age_limit'
]
CLASS_METHODS = [ CLASS_METHODS = [
'ie_key', 'working', 'description', 'suitable', '_match_valid_url', '_match_id', 'get_temp_id', 'is_suitable' 'ie_key', 'working', 'description', 'suitable', '_match_valid_url', '_match_id', 'get_temp_id', 'is_suitable'
] ]
@ -19,17 +23,11 @@
class {name}({bases}): class {name}({bases}):
_module = {module!r} _module = {module!r}
''' '''
with open('devscripts/lazy_load_template.py', encoding='utf-8') as f: MODULE_TEMPLATE = read_file('devscripts/lazy_load_template.py')
MODULE_TEMPLATE = f.read()
def main(): def main():
parser = optparse.OptionParser(usage='%prog [OUTFILE.py]') lazy_extractors_filename = get_filename_args(default_outfile='yt_dlp/extractor/lazy_extractors.py')
args = parser.parse_args()[1] or ['yt_dlp/extractor/lazy_extractors.py']
if len(args) != 1:
parser.error('Expected only an output filename')
lazy_extractors_filename = args[0]
if os.path.exists(lazy_extractors_filename): if os.path.exists(lazy_extractors_filename):
os.remove(lazy_extractors_filename) os.remove(lazy_extractors_filename)
@ -46,20 +44,20 @@ def main():
*build_ies(_ALL_CLASSES, (InfoExtractor, SearchInfoExtractor), DummyInfoExtractor), *build_ies(_ALL_CLASSES, (InfoExtractor, SearchInfoExtractor), DummyInfoExtractor),
)) ))
with open(lazy_extractors_filename, 'wt', encoding='utf-8') as f: write_file(lazy_extractors_filename, f'{module_src}\n')
f.write(f'{module_src}\n')
def get_all_ies(): def get_all_ies():
PLUGINS_DIRNAME = 'ytdlp_plugins' PLUGINS_DIRNAME = 'ytdlp_plugins'
BLOCKED_DIRNAME = f'{PLUGINS_DIRNAME}_blocked' BLOCKED_DIRNAME = f'{PLUGINS_DIRNAME}_blocked'
if os.path.exists(PLUGINS_DIRNAME): if os.path.exists(PLUGINS_DIRNAME):
os.rename(PLUGINS_DIRNAME, BLOCKED_DIRNAME) # os.rename cannot be used, e.g. in Docker. See https://github.com/yt-dlp/yt-dlp/pull/4958
shutil.move(PLUGINS_DIRNAME, BLOCKED_DIRNAME)
try: try:
from yt_dlp.extractor.extractors import _ALL_CLASSES from yt_dlp.extractor.extractors import _ALL_CLASSES
finally: finally:
if os.path.exists(BLOCKED_DIRNAME): if os.path.exists(BLOCKED_DIRNAME):
os.rename(BLOCKED_DIRNAME, PLUGINS_DIRNAME) shutil.move(BLOCKED_DIRNAME, PLUGINS_DIRNAME)
return _ALL_CLASSES return _ALL_CLASSES
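The switch from `os.rename` to `shutil.move` keeps the same hide-then-restore pattern around the import. `shutil.move` uses `os.rename` when it can and otherwise copies the source and removes it. A small sketch of the pattern in a throwaway temporary directory; the directory names mirror the diff, everything else is illustrative:

```python
import os
import shutil
import tempfile

# Sketch only: runs in a temp dir so no real plugin directory is touched
with tempfile.TemporaryDirectory() as root:
    plugins = os.path.join(root, 'ytdlp_plugins')
    blocked = f'{plugins}_blocked'
    os.mkdir(plugins)

    # Hide the directory while the guarded step runs
    shutil.move(plugins, blocked)
    try:
        hidden = not os.path.exists(plugins)
    finally:
        # Always restore the original name, even if the guarded step fails
        if os.path.exists(blocked):
            shutil.move(blocked, plugins)
    restored = os.path.exists(plugins) and not os.path.exists(blocked)

print(hidden, restored)
```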


@ -5,10 +5,17 @@
This must be run in a console of correct width This must be run in a console of correct width
""" """
# Allow direct execution
import os
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import functools import functools
import re import re
import sys
from devscripts.utils import read_file, write_file
README_FILE = 'README.md' README_FILE = 'README.md'
@ -38,6 +45,10 @@ def apply_patch(text, patch):
delim = f'\n{" " * switch_col_width}' delim = f'\n{" " * switch_col_width}'
PATCHES = ( PATCHES = (
( # Standardize update message
r'(?m)^( -U, --update\s+).+(\n \s.+)*$',
r'\1Update this program to the latest version',
),
( # Headings ( # Headings
r'(?m)^ (\w.+\n)( (?=\w))?', r'(?m)^ (\w.+\n)( (?=\w))?',
r'## \1' r'## \1'
@ -63,11 +74,9 @@ def apply_patch(text, patch):
), ),
) )
with open(README_FILE, encoding='utf-8') as f: readme = read_file(README_FILE)
readme = f.read()
with open(README_FILE, 'w', encoding='utf-8') as f: write_file(README_FILE, ''.join((
f.write(''.join((
take_section(readme, end=f'## {OPTIONS_START}'), take_section(readme, end=f'## {OPTIONS_START}'),
functools.reduce(apply_patch, PATCHES, options), functools.reduce(apply_patch, PATCHES, options),
take_section(readme, f'# {OPTIONS_END}'), take_section(readme, f'# {OPTIONS_END}'),


@ -7,21 +7,13 @@
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import optparse from devscripts.utils import get_filename_args, write_file
from yt_dlp.extractor import list_extractor_classes from yt_dlp.extractor import list_extractor_classes
def main(): def main():
parser = optparse.OptionParser(usage='%prog OUTFILE.md')
_, args = parser.parse_args()
if len(args) != 1:
parser.error('Expected an output filename')
out = '\n'.join(ie.description() for ie in list_extractor_classes() if ie.IE_DESC is not False) out = '\n'.join(ie.description() for ie in list_extractor_classes() if ie.IE_DESC is not False)
write_file(get_filename_args(), f'# Supported sites\n{out}\n')
with open(args[0], 'w', encoding='utf-8') as outf:
outf.write(f'# Supported sites\n{out}\n')
if __name__ == '__main__': if __name__ == '__main__':


@ -1,9 +1,22 @@
#!/usr/bin/env python3 #!/usr/bin/env python3
import optparse # Allow direct execution
import os
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import os.path import os.path
import re import re
from devscripts.utils import (
compose_functions,
get_filename_args,
read_file,
write_file,
)
ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
README_FILE = os.path.join(ROOT_DIR, 'README.md') README_FILE = os.path.join(ROOT_DIR, 'README.md')
@ -22,25 +35,6 @@
''' '''
def main():
parser = optparse.OptionParser(usage='%prog OUTFILE.md')
_, args = parser.parse_args()
if len(args) != 1:
parser.error('Expected an output filename')
outfile, = args
with open(README_FILE, encoding='utf-8') as f:
readme = f.read()
readme = filter_excluded_sections(readme)
readme = move_sections(readme)
readme = filter_options(readme)
with open(outfile, 'w', encoding='utf-8') as outf:
outf.write(PREFIX + readme)
def filter_excluded_sections(readme): def filter_excluded_sections(readme):
EXCLUDED_SECTION_BEGIN_STRING = re.escape('<!-- MANPAGE: BEGIN EXCLUDED SECTION -->') EXCLUDED_SECTION_BEGIN_STRING = re.escape('<!-- MANPAGE: BEGIN EXCLUDED SECTION -->')
EXCLUDED_SECTION_END_STRING = re.escape('<!-- MANPAGE: END EXCLUDED SECTION -->') EXCLUDED_SECTION_END_STRING = re.escape('<!-- MANPAGE: END EXCLUDED SECTION -->')
@ -92,5 +86,12 @@ def filter_options(readme):
return readme.replace(section, options, 1) return readme.replace(section, options, 1)
TRANSFORM = compose_functions(filter_excluded_sections, move_sections, filter_options)
def main():
write_file(get_filename_args(), PREFIX + TRANSFORM(read_file(README_FILE)))
if __name__ == '__main__': if __name__ == '__main__':
main() main()


@ -1,13 +1,13 @@
#!/usr/bin/env sh #!/usr/bin/env sh
if [ -z $1 ]; then if [ -z "$1" ]; then
test_set='test' test_set='test'
elif [ $1 = 'core' ]; then elif [ "$1" = 'core' ]; then
test_set="-m not download" test_set="-m not download"
elif [ $1 = 'download' ]; then elif [ "$1" = 'download' ]; then
test_set="-m download" test_set="-m download"
else else
echo 'Invalid test type "'$1'". Use "core" | "download"' echo 'Invalid test type "'"$1"'". Use "core" | "download"'
exit 1 exit 1
fi fi

36
devscripts/set-variant.py Normal file

@ -0,0 +1,36 @@
#!/usr/bin/env python3
# Allow direct execution
import os
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import argparse
import functools
import re
from devscripts.utils import compose_functions, read_file, write_file
VERSION_FILE = 'yt_dlp/version.py'
def parse_options():
parser = argparse.ArgumentParser(description='Set the build variant of the package')
parser.add_argument('variant', help='Name of the variant')
parser.add_argument('-M', '--update-message', default=None, help='Message to show in -U')
return parser.parse_args()
def property_setter(name, value):
return functools.partial(re.sub, rf'(?m)^{name}\s*=\s*.+$', f'{name} = {value!r}')
opts = parse_options()
transform = compose_functions(
property_setter('VARIANT', opts.variant),
property_setter('UPDATE_HINT', opts.update_message)
)
write_file(VERSION_FILE, transform(read_file(VERSION_FILE)))
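`property_setter` in the new script returns a partially applied `re.sub` that rewrites a single `NAME = ...` line. A self-contained sketch of how it behaves on a version file; the `text` string is a hypothetical stand-in for the contents of `yt_dlp/version.py`:

```python
import functools
import re


def property_setter(name, value):
    # Pre-fill pattern and replacement; the returned callable only needs the text
    return functools.partial(re.sub, rf'(?m)^{name}\s*=\s*.+$', f'{name} = {value!r}')


# Hypothetical stand-in for the contents of yt_dlp/version.py
text = "__version__ = '2022.10.04'\nVARIANT = None\nUPDATE_HINT = None\n"

text = property_setter('VARIANT', 'pip')(text)
text = property_setter('UPDATE_HINT', None)(text)
print(text)
```

One caveat of this approach: `re.sub` processes backslash escapes in the replacement string, so a value whose `repr` contains backslashes (for example a message with a newline) would not be written back verbatim.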


@ -1,5 +1,10 @@
#!/usr/bin/env python3 #!/usr/bin/env python3
"""
Usage: python3 ./devscripts/update-formulae.py <path-to-formulae-rb> <version>
version can be either 0-aligned (yt-dlp version) or normalized (PyPI version)
"""
# Allow direct execution # Allow direct execution
import os import os
import sys import sys
@ -11,8 +16,7 @@
import re import re
import urllib.request import urllib.request
# usage: python3 ./devscripts/update-formulae.py <path-to-formulae-rb> <version> from devscripts.utils import read_file, write_file
# version can be either 0-aligned (yt-dlp version) or normalized (PyPl version)
filename, version = sys.argv[1:] filename, version = sys.argv[1:]
@ -27,11 +31,9 @@
sha256sum = tarball_file['digests']['sha256'] sha256sum = tarball_file['digests']['sha256']
url = tarball_file['url'] url = tarball_file['url']
with open(filename) as r: formulae_text = read_file(filename)
formulae_text = r.read()
formulae_text = re.sub(r'sha256 "[0-9a-f]*?"', 'sha256 "%s"' % sha256sum, formulae_text, count=1) formulae_text = re.sub(r'sha256 "[0-9a-f]*?"', 'sha256 "%s"' % sha256sum, formulae_text, count=1)
formulae_text = re.sub(r'url "[^"]*?"', 'url "%s"' % url, formulae_text, count=1) formulae_text = re.sub(r'url "[^"]*?"', 'url "%s"' % url, formulae_text, count=1)
with open(filename, 'w') as w: write_file(filename, formulae_text)
w.write(formulae_text)


@ -7,32 +7,35 @@
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import contextlib
import subprocess import subprocess
import sys import sys
from datetime import datetime from datetime import datetime
with open('yt_dlp/version.py') as f: from devscripts.utils import read_version, write_file
exec(compile(f.read(), 'yt_dlp/version.py', 'exec'))
old_version = locals()['__version__']
old_version_list = old_version.split('.')
old_ver = '.'.join(old_version_list[:3]) def get_new_version(revision):
old_rev = old_version_list[3] if len(old_version_list) > 3 else '' version = datetime.utcnow().strftime('%Y.%m.%d')
ver = datetime.utcnow().strftime("%Y.%m.%d") if revision:
assert revision.isdigit(), 'Revision must be a number'
else:
old_version = read_version().split('.')
if version.split('.') == old_version[:3]:
revision = str(int((old_version + [0])[3]) + 1)
rev = (sys.argv[1:] or [''])[0] # Use first argument, if present as revision number return f'{version}.{revision}' if revision else version
if not rev:
rev = str(int(old_rev or 0) + 1) if old_ver == ver else ''
VERSION = '.'.join((ver, rev)) if rev else ver
try: def get_git_head():
with contextlib.suppress(Exception):
sp = subprocess.Popen(['git', 'rev-parse', '--short', 'HEAD'], stdout=subprocess.PIPE) sp = subprocess.Popen(['git', 'rev-parse', '--short', 'HEAD'], stdout=subprocess.PIPE)
GIT_HEAD = sp.communicate()[0].decode().strip() or None return sp.communicate()[0].decode().strip() or None
except Exception:
GIT_HEAD = None
VERSION = get_new_version((sys.argv + [''])[1])
GIT_HEAD = get_git_head()
VERSION_FILE = f'''\ VERSION_FILE = f'''\
# Autogenerated by devscripts/update-version.py # Autogenerated by devscripts/update-version.py
@ -40,10 +43,12 @@
__version__ = {VERSION!r} __version__ = {VERSION!r}
RELEASE_GIT_HEAD = {GIT_HEAD!r} RELEASE_GIT_HEAD = {GIT_HEAD!r}
VARIANT = None
UPDATE_HINT = None
''' '''
with open('yt_dlp/version.py', 'wt') as f: write_file('yt_dlp/version.py', VERSION_FILE)
f.write(VERSION_FILE) print(f'::set-output name=ytdlp_version::{VERSION}')
print('::set-output name=ytdlp_version::' + VERSION)
print(f'\nVersion = {VERSION}, Git HEAD = {GIT_HEAD}') print(f'\nVersion = {VERSION}, Git HEAD = {GIT_HEAD}')
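The refactored `get_new_version` uses today's UTC date as the version and appends a revision only when one is given explicitly or when the previous release has the same date. A sketch of that logic; `old_version` and `today` are extra parameters added here purely so the example is deterministic (the script itself calls `read_version()` and `datetime.utcnow()`):

```python
from datetime import datetime, timezone


def get_new_version(revision, old_version, today=None):
    # old_version and today are parameters only in this sketch
    version = today or datetime.now(timezone.utc).strftime('%Y.%m.%d')
    if revision:
        assert revision.isdigit(), 'Revision must be a number'
    else:
        old = old_version.split('.')
        if version.split('.') == old[:3]:
            # Same date as the previous release: bump (or start) the revision
            revision = str(int((old + [0])[3]) + 1)
    return f'{version}.{revision}' if revision else version


print(get_new_version('', '2022.10.04', today='2022.10.19'))    # new date, no revision
print(get_new_version('', '2022.10.19', today='2022.10.19'))    # same date, first revision
print(get_new_version('', '2022.10.19.1', today='2022.10.19'))  # same date, next revision
print(get_new_version('5', '2022.10.04', today='2022.10.19'))   # explicit revision
```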

35
devscripts/utils.py Normal file

@ -0,0 +1,35 @@
import argparse
import functools
def read_file(fname):
with open(fname, encoding='utf-8') as f:
return f.read()
def write_file(fname, content):
with open(fname, 'w', encoding='utf-8') as f:
return f.write(content)
# Get the version without importing the package
def read_version(fname='yt_dlp/version.py'):
exec(compile(read_file(fname), fname, 'exec'))
return locals()['__version__']
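`read_version` executes the version file and pulls `__version__` out of `locals()`. A possible variant, shown only as a sketch, passes an explicit namespace dict to `exec` so that the result does not depend on how `locals()` behaves inside a function; the temporary version file below is hypothetical:

```python
import os
import tempfile


def read_version(fname):
    # Execute the file in an explicit dict instead of relying on locals()
    namespace = {}
    with open(fname, encoding='utf-8') as f:
        exec(compile(f.read(), fname, 'exec'), namespace)
    return namespace['__version__']


# Hypothetical version file written to a temp dir for the demonstration
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'version.py')
    with open(path, 'w', encoding='utf-8') as f:
        f.write("__version__ = '2022.10.04'\nRELEASE_GIT_HEAD = None\n")
    version = read_version(path)

print(version)
```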
def get_filename_args(has_infile=False, default_outfile=None):
parser = argparse.ArgumentParser()
if has_infile:
parser.add_argument('infile', help='Input file')
kwargs = {'nargs': '?', 'default': default_outfile} if default_outfile else {}
parser.add_argument('outfile', **kwargs, help='Output file')
opts = parser.parse_args()
if has_infile:
return opts.infile, opts.outfile
return opts.outfile
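`get_filename_args` replaces the per-script `optparse` boilerplate: `outfile` becomes an optional positional only when a default is supplied. The sketch below reproduces the function with one change, an explicit `argv` parameter (not in the original, which reads `sys.argv`), so that the three call shapes can be shown without a command line:

```python
import argparse


def get_filename_args(argv, has_infile=False, default_outfile=None):
    # argv is passed explicitly in this sketch; the original calls parse_args()
    parser = argparse.ArgumentParser()
    if has_infile:
        parser.add_argument('infile', help='Input file')
    # nargs='?' makes the positional optional, falling back to the default
    kwargs = {'nargs': '?', 'default': default_outfile} if default_outfile else {}
    parser.add_argument('outfile', **kwargs, help='Output file')
    opts = parser.parse_args(argv)
    if has_infile:
        return opts.infile, opts.outfile
    return opts.outfile


print(get_filename_args(['supportedsites.md']))
print(get_filename_args([], default_outfile='yt_dlp/extractor/lazy_extractors.py'))
print(get_filename_args(['in.yml', 'out.yml'], has_infile=True))
```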
def compose_functions(*functions):
return lambda x: functools.reduce(lambda y, f: f(y), functions, x)
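`compose_functions` applies its arguments left to right, which is why the man page script can list `filter_excluded_sections, move_sections, filter_options` in the same order the old `main()` called them. A short sketch with hypothetical string transforms in place of the README filters:

```python
import functools


def compose_functions(*functions):
    # reduce feeds the running value through each function in the order given
    return lambda x: functools.reduce(lambda y, f: f(y), functions, x)


# Hypothetical transforms standing in for the README filter functions
transform = compose_functions(str.strip, str.upper, lambda s: s + '!')
result = transform('  readme  ')
print(result)  # README!
```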


@ -1,11 +1,17 @@
#!/usr/bin/env python3 #!/usr/bin/env python3
# Allow direct execution
import os import os
import platform
import sys import sys
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
import platform
from PyInstaller.__main__ import run as run_pyinstaller from PyInstaller.__main__ import run as run_pyinstaller
from devscripts.utils import read_version
OS_NAME, MACHINE, ARCH = sys.platform, platform.machine(), platform.architecture()[0][:2] OS_NAME, MACHINE, ARCH = sys.platform, platform.machine(), platform.architecture()[0][:2]
if MACHINE in ('x86_64', 'AMD64') or ('i' in MACHINE and '86' in MACHINE): if MACHINE in ('x86_64', 'AMD64') or ('i' in MACHINE and '86' in MACHINE):
# NB: Windows x86 has MACHINE = AMD64 irrespective of bitness # NB: Windows x86 has MACHINE = AMD64 irrespective of bitness
@ -13,8 +19,7 @@
def main(): def main():
opts = parse_options() opts, version = parse_options(), read_version()
version = read_version('yt_dlp/version.py')
onedir = '--onedir' in opts or '-D' in opts onedir = '--onedir' in opts or '-D' in opts
if not onedir and '-F' not in opts and '--onefile' not in opts: if not onedir and '-F' not in opts and '--onefile' not in opts:
@ -53,13 +58,6 @@ def parse_options():
return opts return opts
# Get the version from yt_dlp/version.py without importing the package
def read_version(fname):
with open(fname, encoding='utf-8') as f:
exec(compile(f.read(), fname, 'exec'))
return locals()['__version__']
def exe(onedir): def exe(onedir):
"""@returns (name, path)""" """@returns (name, path)"""
name = '_'.join(filter(None, ( name = '_'.join(filter(None, (
@ -83,7 +81,7 @@ def version_to_list(version):
def dependency_options(): def dependency_options():
# Due to the current implementation, these are auto-detected, but explicitly add them just in case # Due to the current implementation, these are auto-detected, but explicitly add them just in case
dependencies = [pycryptodome_module(), 'mutagen', 'brotli', 'certifi', 'websockets'] dependencies = [pycryptodome_module(), 'mutagen', 'brotli', 'certifi', 'websockets']
excluded_modules = ['test', 'ytdlp_plugins', 'youtube_dl', 'youtube_dlc'] excluded_modules = ('youtube_dl', 'youtube_dlc', 'test', 'ytdlp_plugins', 'devscripts')
yield from (f'--hidden-import={module}' for module in dependencies) yield from (f'--hidden-import={module}' for module in dependencies)
yield '--collect-submodules=websockets' yield '--collect-submodules=websockets'


@ -10,6 +10,14 @@ per_file_ignores =
devscripts/lazy_load_template.py: F401 devscripts/lazy_load_template.py: F401
[autoflake]
ignore-init-module-imports = true
ignore-pass-after-docstring = true
remove-all-unused-imports = true
remove-duplicate-keys = true
remove-unused-variables = true
[tool:pytest] [tool:pytest]
addopts = -ra -v --strict-markers addopts = -ra -v --strict-markers
markers = markers =


@ -12,37 +12,26 @@
from distutils.core import Command, setup from distutils.core import Command, setup
setuptools_available = False setuptools_available = False
from devscripts.utils import read_file, read_version
def read(fname): VERSION = read_version()
with open(fname, encoding='utf-8') as f:
return f.read()
# Get the version from yt_dlp/version.py without importing the package
def read_version(fname):
exec(compile(read(fname), fname, 'exec'))
return locals()['__version__']
VERSION = read_version('yt_dlp/version.py')
DESCRIPTION = 'A youtube-dl fork with additional features and patches' DESCRIPTION = 'A youtube-dl fork with additional features and patches'
LONG_DESCRIPTION = '\n\n'.join(( LONG_DESCRIPTION = '\n\n'.join((
'Official repository: <https://github.com/yt-dlp/yt-dlp>', 'Official repository: <https://github.com/yt-dlp/yt-dlp>',
'**PS**: Some links in this document will not work since this is a copy of the README.md from Github', '**PS**: Some links in this document will not work since this is a copy of the README.md from Github',
read('README.md'))) read_file('README.md')))
REQUIREMENTS = read('requirements.txt').splitlines() REQUIREMENTS = read_file('requirements.txt').splitlines()
def packages(): def packages():
if setuptools_available: if setuptools_available:
return find_packages(exclude=('youtube_dl', 'youtube_dlc', 'test', 'ytdlp_plugins')) return find_packages(exclude=('youtube_dl', 'youtube_dlc', 'test', 'ytdlp_plugins', 'devscripts'))
return [ return [
'yt_dlp', 'yt_dlp.extractor', 'yt_dlp.downloader', 'yt_dlp.postprocessor', 'yt_dlp.compat', 'yt_dlp', 'yt_dlp.extractor', 'yt_dlp.downloader', 'yt_dlp.postprocessor', 'yt_dlp.compat',
'yt_dlp.extractor.anvato_token_generator',
] ]
@ -121,7 +110,7 @@ def run(self):
if self.dry_run: if self.dry_run:
print('Skipping build of lazy extractors in dry run mode') print('Skipping build of lazy extractors in dry run mode')
return return
subprocess.run([sys.executable, 'devscripts/make_lazy_extractors.py', 'yt_dlp/extractor/lazy_extractors.py']) subprocess.run([sys.executable, 'devscripts/make_lazy_extractors.py'])
params = py2exe_params() if sys.argv[1:2] == ['py2exe'] else build_params() params = py2exe_params() if sys.argv[1:2] == ['py2exe'] else build_params()


@ -3,11 +3,12 @@ # Supported sites
- **0000studio:clip** - **0000studio:clip**
- **17live** - **17live**
- **17live:clip** - **17live:clip**
- **1News**: 1news.co.nz article videos
- **1tv**: Первый канал - **1tv**: Первый канал
- **20.detik.com**
- **20min** - **20min**
- **23video** - **23video**
- **247sports** - **247sports**
- **24tv.ua**
- **24video** - **24video**
- **3qsdn**: 3Q SDN - **3qsdn**: 3Q SDN
- **3sat** - **3sat**
@ -18,7 +19,7 @@ # Supported sites
- **8tracks** - **8tracks**
- **91porn** - **91porn**
- **9c9media** - **9c9media**
- **9gag** - **9gag**: 9GAG
- **9now.com.au** - **9now.com.au**
- **abc.net.au** - **abc.net.au**
- **abc.net.au:iview** - **abc.net.au:iview**
@ -64,8 +65,8 @@ # Supported sites
- **AmericasTestKitchenSeason** - **AmericasTestKitchenSeason**
- **AmHistoryChannel** - **AmHistoryChannel**
- **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **Angel**
- **AnimalPlanet** - **AnimalPlanet**
- **AnimeOnDemand**: [<abbr title="netrc machine"><em>animeondemand</em></abbr>]
- **ant1newsgr:article**: ant1news.gr articles - **ant1newsgr:article**: ant1news.gr articles
- **ant1newsgr:embed**: ant1news.gr embedded videos - **ant1newsgr:embed**: ant1news.gr embedded videos
- **ant1newsgr:watch**: ant1news.gr videos - **ant1newsgr:watch**: ant1news.gr videos
@ -127,11 +128,14 @@ # Supported sites
- **bbc.co.uk:iplayer:group** - **bbc.co.uk:iplayer:group**
- **bbc.co.uk:playlist** - **bbc.co.uk:playlist**
- **BBVTV**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>] - **BBVTV**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>]
- **BBVTVLive**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>]
- **BBVTVRecordings**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>]
- **Beatport** - **Beatport**
- **Beeg** - **Beeg**
- **BehindKink** - **BehindKink**
- **Bellator** - **Bellator**
- **BellMedia** - **BellMedia**
- **BerufeTV**
- **Bet** - **Bet**
- **bfi:player** - **bfi:player**
- **bfmtv** - **bfmtv**
@ -145,9 +149,11 @@ # Supported sites
- **Bilibili category extractor** - **Bilibili category extractor**
- **BilibiliAudio** - **BilibiliAudio**
- **BilibiliAudioAlbum** - **BilibiliAudioAlbum**
- **BilibiliChannel**
- **BiliBiliPlayer** - **BiliBiliPlayer**
- **BiliBiliSearch**: Bilibili video search; "bilisearch:" prefix - **BiliBiliSearch**: Bilibili video search; "bilisearch:" prefix
- **BilibiliSpaceAudio**
- **BilibiliSpacePlaylist**
- **BilibiliSpaceVideo**
- **BiliIntl**: [<abbr title="netrc machine"><em>biliintl</em></abbr>] - **BiliIntl**: [<abbr title="netrc machine"><em>biliintl</em></abbr>]
- **BiliIntlSeries**: [<abbr title="netrc machine"><em>biliintl</em></abbr>] - **BiliIntlSeries**: [<abbr title="netrc machine"><em>biliintl</em></abbr>]
- **BiliLive** - **BiliLive**
@ -165,6 +171,7 @@ # Supported sites
- **Bloomberg** - **Bloomberg**
- **BokeCC** - **BokeCC**
- **BongaCams** - **BongaCams**
- **BooyahClips**
- **BostonGlobe** - **BostonGlobe**
- **Box** - **Box**
- **Bpb**: Bundeszentrale für politische Bildung - **Bpb**: Bundeszentrale für politische Bildung
@ -177,6 +184,7 @@ # Supported sites
- **BRMediathek**: Bayerischer Rundfunk Mediathek - **BRMediathek**: Bayerischer Rundfunk Mediathek
- **bt:article**: Bergens Tidende Articles - **bt:article**: Bergens Tidende Articles
- **bt:vestlendingen**: Bergens Tidende - Vestlendingen - **bt:vestlendingen**: Bergens Tidende - Vestlendingen
- **Bundesliga**
- **BusinessInsider** - **BusinessInsider**
- **BuzzFeed** - **BuzzFeed**
- **BYUtv** - **BYUtv**
@ -187,6 +195,7 @@ # Supported sites
- **Camdemy** - **Camdemy**
- **CamdemyFolder** - **CamdemyFolder**
- **CamModels** - **CamModels**
- **CamtasiaEmbed**
- **CamWithHer** - **CamWithHer**
- **CanalAlpha** - **CanalAlpha**
- **canalc2.tv** - **canalc2.tv**
@ -232,6 +241,7 @@ # Supported sites
- **Clippit** - **Clippit**
- **ClipRs** - **ClipRs**
- **Clipsyndicate** - **Clipsyndicate**
- **ClipYouEmbed**
- **CloserToTruth** - **CloserToTruth**
- **CloudflareStream** - **CloudflareStream**
- **Cloudy** - **Cloudy**
@ -243,6 +253,7 @@ # Supported sites
- **CNN** - **CNN**
- **CNNArticle** - **CNNArticle**
- **CNNBlogs** - **CNNBlogs**
- **CNNIndonesia**
- **ComedyCentral** - **ComedyCentral**
- **ComedyCentralTV** - **ComedyCentralTV**
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED - **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
@ -299,6 +310,7 @@ # Supported sites
- **defense.gouv.fr** - **defense.gouv.fr**
- **democracynow** - **democracynow**
- **DestinationAmerica** - **DestinationAmerica**
- **DetikEmbed**
- **DHM**: Filmarchiv - Deutsches Historisches Museum - **DHM**: Filmarchiv - Deutsches Historisches Museum
- **Digg** - **Digg**
- **DigitalConcertHall**: [<abbr title="netrc machine"><em>digitalconcerthall</em></abbr>] DigitalConcertHall extractor - **DigitalConcertHall**: [<abbr title="netrc machine"><em>digitalconcerthall</em></abbr>] DigitalConcertHall extractor
@ -345,6 +357,8 @@ # Supported sites
- **ehftv** - **ehftv**
- **eHow** - **eHow**
- **EinsUndEinsTV**: [<abbr title="netrc machine"><em>1und1tv</em></abbr>] - **EinsUndEinsTV**: [<abbr title="netrc machine"><em>1und1tv</em></abbr>]
- **EinsUndEinsTVLive**: [<abbr title="netrc machine"><em>1und1tv</em></abbr>]
- **EinsUndEinsTVRecordings**: [<abbr title="netrc machine"><em>1und1tv</em></abbr>]
- **Einthusan** - **Einthusan**
- **eitb.tv** - **eitb.tv**
- **EllenTube** - **EllenTube**
@ -357,6 +371,7 @@ # Supported sites
- **Engadget** - **Engadget**
- **Epicon** - **Epicon**
- **EpiconSeries** - **EpiconSeries**
- **Epoch**
- **Eporner** - **Eporner**
- **EroProfile**: [<abbr title="netrc machine"><em>eroprofile</em></abbr>] - **EroProfile**: [<abbr title="netrc machine"><em>eroprofile</em></abbr>]
- **EroProfile:album** - **EroProfile:album**
@ -370,13 +385,17 @@ # Supported sites
- **EsriVideo** - **EsriVideo**
- **Europa** - **Europa**
- **EuropeanTour** - **EuropeanTour**
- **Eurosport**
- **EUScreen** - **EUScreen**
- **EWETV**: [<abbr title="netrc machine"><em>ewetv</em></abbr>] - **EWETV**: [<abbr title="netrc machine"><em>ewetv</em></abbr>]
- **EWETVLive**: [<abbr title="netrc machine"><em>ewetv</em></abbr>]
- **EWETVRecordings**: [<abbr title="netrc machine"><em>ewetv</em></abbr>]
- **ExpoTV** - **ExpoTV**
- **Expressen** - **Expressen**
- **ExtremeTube** - **ExtremeTube**
- **EyedoTV** - **EyedoTV**
- **facebook**: [<abbr title="netrc machine"><em>facebook</em></abbr>] - **facebook**: [<abbr title="netrc machine"><em>facebook</em></abbr>]
- **facebook:reel**
- **FacebookPluginsVideo** - **FacebookPluginsVideo**
- **fancode:live**: [<abbr title="netrc machine"><em>fancode</em></abbr>] - **fancode:live**: [<abbr title="netrc machine"><em>fancode</em></abbr>]
- **fancode:vod**: [<abbr title="netrc machine"><em>fancode</em></abbr>] - **fancode:vod**: [<abbr title="netrc machine"><em>fancode</em></abbr>]
@ -450,6 +469,8 @@ # Supported sites
- **GiantBomb** - **GiantBomb**
- **Giga** - **Giga**
- **GlattvisionTV**: [<abbr title="netrc machine"><em>glattvisiontv</em></abbr>] - **GlattvisionTV**: [<abbr title="netrc machine"><em>glattvisiontv</em></abbr>]
- **GlattvisionTVLive**: [<abbr title="netrc machine"><em>glattvisiontv</em></abbr>]
- **GlattvisionTVRecordings**: [<abbr title="netrc machine"><em>glattvisiontv</em></abbr>]
- **Glide**: Glide mobile video messages (glide.me) - **Glide**: Glide mobile video messages (glide.me)
- **Globo**: [<abbr title="netrc machine"><em>globo</em></abbr>] - **Globo**: [<abbr title="netrc machine"><em>globo</em></abbr>]
- **GloboArticle** - **GloboArticle**
@ -465,6 +486,7 @@ # Supported sites
- **google:podcasts:feed** - **google:podcasts:feed**
- **GoogleDrive** - **GoogleDrive**
- **GoogleDrive:Folder** - **GoogleDrive:Folder**
- **GoPlay**: [<abbr title="netrc machine"><em>goplay</em></abbr>]
- **GoPro** - **GoPro**
- **Goshgay** - **Goshgay**
- **GoToStage** - **GoToStage**
@ -473,6 +495,7 @@ # Supported sites
- **gronkh:feed** - **gronkh:feed**
- **gronkh:vods** - **gronkh:vods**
- **Groupon** - **Groupon**
- **Harpodeon**
- **hbo** - **hbo**
- **HearThisAt** - **HearThisAt**
- **Heise** - **Heise**
@ -491,6 +514,7 @@ # Supported sites
- **hitbox:live** - **hitbox:live**
- **HitRecord** - **HitRecord**
- **hketv**: 香港教育局教育電視 (HKETV) Educational Television, Hong Kong Educational Bureau - **hketv**: 香港教育局教育電視 (HKETV) Educational Television, Hong Kong Educational Bureau
- **Holodex**
- **HotNewHipHop** - **HotNewHipHop**
- **hotstar** - **hotstar**
- **hotstar:playlist** - **hotstar:playlist**
@ -502,6 +526,7 @@ # Supported sites
- **HRTiPlaylist**: [<abbr title="netrc machine"><em>hrti</em></abbr>] - **HRTiPlaylist**: [<abbr title="netrc machine"><em>hrti</em></abbr>]
- **HSEProduct** - **HSEProduct**
- **HSEShow** - **HSEShow**
- **html5**
- **Huajiao**: 花椒直播 - **Huajiao**: 花椒直播
- **HuffPost**: Huffington Post - **HuffPost**: Huffington Post
- **Hungama** - **Hungama**
@ -511,11 +536,14 @@ # Supported sites
- **Hypem** - **Hypem**
- **Hytale** - **Hytale**
- **Icareus** - **Icareus**
- **iflix:episode**
- **IflixSeries**
- **ign.com** - **ign.com**
- **IGNArticle** - **IGNArticle**
- **IGNVideo** - **IGNVideo**
- **IHeartRadio** - **IHeartRadio**
- **iheartradio:podcast** - **iheartradio:podcast**
- **Iltalehti**
- **imdb**: Internet Movie Database trailers - **imdb**: Internet Movie Database trailers
- **imdb:list**: Internet Movie Database lists - **imdb:list**: Internet Movie Database lists
- **Imgur** - **Imgur**
@ -538,6 +566,9 @@ # Supported sites
- **iq.com**: International version of iQiyi - **iq.com**: International version of iQiyi
- **iq.com:album** - **iq.com:album**
- **iqiyi**: [<abbr title="netrc machine"><em>iqiyi</em></abbr>] 爱奇艺 - **iqiyi**: [<abbr title="netrc machine"><em>iqiyi</em></abbr>] 爱奇艺
- **IslamChannel**
- **IslamChannelSeries**
- **IsraelNationalNews**
- **ITProTV** - **ITProTV**
- **ITProTVCourse** - **ITProTVCourse**
- **ITTF** - **ITTF**
@ -573,6 +604,7 @@ # Supported sites
- **KickStarter** - **KickStarter**
- **KinjaEmbed** - **KinjaEmbed**
- **KinoPoisk** - **KinoPoisk**
- **KompasVideo**
- **KonserthusetPlay** - **KonserthusetPlay**
- **Koo** - **Koo**
- **KrasView**: Красвью - **KrasView**: Красвью
@ -669,6 +701,7 @@ # Supported sites
- **Mediasite** - **Mediasite**
- **MediasiteCatalog** - **MediasiteCatalog**
- **MediasiteNamedCatalog** - **MediasiteNamedCatalog**
- **MediaWorksNZVOD**
- **Medici** - **Medici**
- **megaphone.fm**: megaphone.fm embedded players - **megaphone.fm**: megaphone.fm embedded players
- **megatvcom**: megatv.com videos - **megatvcom**: megatv.com videos
@ -681,6 +714,7 @@ # Supported sites
- **mewatch** - **mewatch**
- **Mgoon** - **Mgoon**
- **MiaoPai** - **MiaoPai**
- **MicrosoftEmbed**
- **microsoftstream**: Microsoft Stream - **microsoftstream**: Microsoft Stream
- **mildom**: Record ongoing live by specific user in Mildom - **mildom**: Record ongoing live by specific user in Mildom
- **mildom:clip**: Clip in Mildom - **mildom:clip**: Clip in Mildom
@ -702,10 +736,13 @@ # Supported sites
- **mixcloud:playlist** - **mixcloud:playlist**
- **mixcloud:user** - **mixcloud:user**
- **MLB** - **MLB**
- **MLBTV**: [<abbr title="netrc machine"><em>mlb</em></abbr>]
- **MLBVideo** - **MLBVideo**
- **MLSSoccer** - **MLSSoccer**
- **Mnet** - **Mnet**
- **MNetTV**: [<abbr title="netrc machine"><em>mnettv</em></abbr>] - **MNetTV**: [<abbr title="netrc machine"><em>mnettv</em></abbr>]
- **MNetTVLive**: [<abbr title="netrc machine"><em>mnettv</em></abbr>]
- **MNetTVRecordings**: [<abbr title="netrc machine"><em>mnettv</em></abbr>]
- **MochaVideo** - **MochaVideo**
- **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net - **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
- **Mofosex** - **Mofosex**
@ -715,9 +752,11 @@ # Supported sites
- **Motherless** - **Motherless**
- **MotherlessGroup** - **MotherlessGroup**
- **Motorsport**: motorsport.com - **Motorsport**: motorsport.com
- **MotorTrend**
- **MovieClips** - **MovieClips**
- **MovieFap** - **MovieFap**
- **Moviepilot** - **Moviepilot**
- **MoviewPlay**
- **Moviezine** - **Moviezine**
- **MovingImage** - **MovingImage**
- **MSN** - **MSN**
@ -775,6 +814,7 @@ # Supported sites
- **NBCSports** - **NBCSports**
- **NBCSportsStream** - **NBCSportsStream**
- **NBCSportsVPlayer** - **NBCSportsVPlayer**
- **NBCStations**
- **ndr**: NDR.de - Norddeutscher Rundfunk - **ndr**: NDR.de - Norddeutscher Rundfunk
- **ndr:embed** - **ndr:embed**
- **ndr:embed:base** - **ndr:embed:base**
@ -790,13 +830,16 @@ # Supported sites
- **netease:program**: 网易云音乐 - 电台节目 - **netease:program**: 网易云音乐 - 电台节目
- **netease:singer**: 网易云音乐 - 歌手 - **netease:singer**: 网易云音乐 - 歌手
- **netease:song**: 网易云音乐 - **netease:song**: 网易云音乐
- **NetPlus**: [<abbr title="netrc machine"><em>netplus</em></abbr>] - **NetPlusTV**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **NetPlusTVLive**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **NetPlusTVRecordings**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **Netverse** - **Netverse**
- **NetversePlaylist** - **NetversePlaylist**
- **Netzkino** - **Netzkino**
- **Newgrounds** - **Newgrounds**
- **Newgrounds:playlist** - **Newgrounds:playlist**
- **Newgrounds:user** - **Newgrounds:user**
- **NewsPicks**
- **Newstube** - **Newstube**
- **Newsy** - **Newsy**
- **NextMedia**: 蘋果日報 - **NextMedia**: 蘋果日報
@ -806,8 +849,8 @@ # Supported sites
- **NexxEmbed** - **NexxEmbed**
- **NFB** - **NFB**
- **NFHSNetwork** - **NFHSNetwork**
- **nfl.com**: (**Currently broken**) - **nfl.com**
- **nfl.com:article**: (**Currently broken**) - **nfl.com:article**
- **NhkForSchoolBangumi** - **NhkForSchoolBangumi**
- **NhkForSchoolProgramList** - **NhkForSchoolProgramList**
- **NhkForSchoolSubject**: Portal page for each school subjects, like Japanese (kokugo, 国語) or math (sansuu/suugaku or 算数・数学) - **NhkForSchoolSubject**: Portal page for each school subjects, like Japanese (kokugo, 国語) or math (sansuu/suugaku or 算数・数学)
@ -890,22 +933,13 @@ # Supported sites
- **openrec:capture** - **openrec:capture**
- **openrec:movie** - **openrec:movie**
- **OraTV** - **OraTV**
- **orf:burgenland**: Radio Burgenland
- **orf:fm4**: radio FM4
- **orf:fm4:story**: fm4.orf.at stories - **orf:fm4:story**: fm4.orf.at stories
- **orf:iptv**: iptv.ORF.at - **orf:iptv**: iptv.ORF.at
- **orf:kaernten**: Radio Kärnten - **orf:radio**
- **orf:noe**: Radio Niederösterreich
- **orf:oberoesterreich**: Radio Oberösterreich
- **orf:oe1**: Radio Österreich 1
- **orf:oe3**: Radio Österreich 3
- **orf:salzburg**: Radio Salzburg
- **orf:steiermark**: Radio Steiermark
- **orf:tirol**: Radio Tirol
- **orf:tvthek**: ORF TVthek - **orf:tvthek**: ORF TVthek
- **orf:vorarlberg**: Radio Vorarlberg
- **orf:wien**: Radio Wien
- **OsnatelTV**: [<abbr title="netrc machine"><em>osnateltv</em></abbr>] - **OsnatelTV**: [<abbr title="netrc machine"><em>osnateltv</em></abbr>]
- **OsnatelTVLive**: [<abbr title="netrc machine"><em>osnateltv</em></abbr>]
- **OsnatelTVRecordings**: [<abbr title="netrc machine"><em>osnateltv</em></abbr>]
- **OutsideTV** - **OutsideTV**
- **PacktPub**: [<abbr title="netrc machine"><em>packtpub</em></abbr>] - **PacktPub**: [<abbr title="netrc machine"><em>packtpub</em></abbr>]
- **PacktPubCourse** - **PacktPubCourse**
@ -919,10 +953,11 @@ # Supported sites
- **ParamountNetwork** - **ParamountNetwork**
- **ParamountPlus** - **ParamountPlus**
- **ParamountPlusSeries** - **ParamountPlusSeries**
- **Parler**: Posts on parler.com
- **parliamentlive.tv**: UK parliament videos - **parliamentlive.tv**: UK parliament videos
- **Parlview** - **Parlview**
- **Patreon** - **Patreon**
- **PatreonUser** - **PatreonCampaign**
- **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! 
(WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC) - **pbs**: Public 
Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! 
(WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC)
- **PearVideo** - **PearVideo**
- **PeekVids** - **PeekVids**
@ -993,6 +1028,7 @@ # Supported sites
- **PornoVoisines** - **PornoVoisines**
- **PornoXO** - **PornoXO**
- **PornTube** - **PornTube**
- **PrankCast**
- **PremiershipRugby** - **PremiershipRugby**
- **PressTV** - **PressTV**
- **ProjectVeritas** - **ProjectVeritas**
@ -1012,6 +1048,8 @@ # Supported sites
- **qqmusic:singer**: QQ音乐 - 歌手 - **qqmusic:singer**: QQ音乐 - 歌手
- **qqmusic:toplist**: QQ音乐 - 排行榜 - **qqmusic:toplist**: QQ音乐 - 排行榜
- **QuantumTV**: [<abbr title="netrc machine"><em>quantumtv</em></abbr>] - **QuantumTV**: [<abbr title="netrc machine"><em>quantumtv</em></abbr>]
- **QuantumTVLive**: [<abbr title="netrc machine"><em>quantumtv</em></abbr>]
- **QuantumTVRecordings**: [<abbr title="netrc machine"><em>quantumtv</em></abbr>]
- **Qub** - **Qub**
- **R7** - **R7**
- **R7Article** - **R7Article**
@ -1030,12 +1068,14 @@ # Supported sites
- **radlive:channel** - **radlive:channel**
- **radlive:season** - **radlive:season**
- **Rai** - **Rai**
- **RaiNews**
- **RaiPlay** - **RaiPlay**
- **RaiPlayLive** - **RaiPlayLive**
- **RaiPlayPlaylist** - **RaiPlayPlaylist**
- **RaiPlaySound** - **RaiPlaySound**
- **RaiPlaySoundLive** - **RaiPlaySoundLive**
- **RaiPlaySoundPlaylist** - **RaiPlaySoundPlaylist**
- **RaiSudtirol**
- **RayWenderlich** - **RayWenderlich**
- **RayWenderlichCourse** - **RayWenderlichCourse**
- **RBMARadio** - **RBMARadio**
@ -1072,7 +1112,7 @@ # Supported sites
- **RoosterTeethSeries**: [<abbr title="netrc machine"><em>roosterteeth</em></abbr>] - **RoosterTeethSeries**: [<abbr title="netrc machine"><em>roosterteeth</em></abbr>]
- **RottenTomatoes** - **RottenTomatoes**
- **Rozhlas** - **Rozhlas**
- **RTBF** - **RTBF**: [<abbr title="netrc machine"><em>rtbf</em></abbr>]
- **RTDocumentry** - **RTDocumentry**
- **RTDocumentryPlaylist** - **RTDocumentryPlaylist**
- **rte**: Raidió Teilifís Éireann TV - **rte**: Raidió Teilifís Éireann TV
@ -1118,7 +1158,11 @@ # Supported sites
- **safari:course**: [<abbr title="netrc machine"><em>safari</em></abbr>] safaribooksonline.com online courses - **safari:course**: [<abbr title="netrc machine"><em>safari</em></abbr>] safaribooksonline.com online courses
- **Saitosan** - **Saitosan**
- **SAKTV**: [<abbr title="netrc machine"><em>saktv</em></abbr>] - **SAKTV**: [<abbr title="netrc machine"><em>saktv</em></abbr>]
- **SAKTVLive**: [<abbr title="netrc machine"><em>saktv</em></abbr>]
- **SAKTVRecordings**: [<abbr title="netrc machine"><em>saktv</em></abbr>]
- **SaltTV**: [<abbr title="netrc machine"><em>salttv</em></abbr>] - **SaltTV**: [<abbr title="netrc machine"><em>salttv</em></abbr>]
- **SaltTVLive**: [<abbr title="netrc machine"><em>salttv</em></abbr>]
- **SaltTVRecordings**: [<abbr title="netrc machine"><em>salttv</em></abbr>]
- **SampleFocus** - **SampleFocus**
- **Sapo**: SAPO Vídeos - **Sapo**: SAPO Vídeos
- **savefrom.net** - **savefrom.net**
@ -1144,6 +1188,7 @@ # Supported sites
- **Shahid**: [<abbr title="netrc machine"><em>shahid</em></abbr>] - **Shahid**: [<abbr title="netrc machine"><em>shahid</em></abbr>]
- **ShahidShow** - **ShahidShow**
- **Shared**: shared.sx - **Shared**: shared.sx
- **ShareVideosEmbed**
- **ShemarooMe** - **ShemarooMe**
- **ShowRoomLive** - **ShowRoomLive**
- **simplecast** - **simplecast**
@ -1164,6 +1209,7 @@ # Supported sites
- **Slideshare** - **Slideshare**
- **SlidesLive** - **SlidesLive**
- **Slutload** - **Slutload**
- **Smotrim**
- **Snotr** - **Snotr**
- **Sohu** - **Sohu**
- **SonyLIV**: [<abbr title="netrc machine"><em>sonyliv</em></abbr>] - **SonyLIV**: [<abbr title="netrc machine"><em>sonyliv</em></abbr>]
@ -1193,8 +1239,8 @@ # Supported sites
- **Sport5** - **Sport5**
- **SportBox** - **SportBox**
- **SportDeutschland** - **SportDeutschland**
- **spotify**: Spotify episodes - **spotify**: Spotify episodes (**Currently broken**)
- **spotify:show**: Spotify shows - **spotify:show**: Spotify shows (**Currently broken**)
- **Spreaker** - **Spreaker**
- **SpreakerPage** - **SpreakerPage**
- **SpreakerShow** - **SpreakerShow**
@ -1268,6 +1314,7 @@ # Supported sites
- **TeleQuebecVideo** - **TeleQuebecVideo**
- **TeleTask** - **TeleTask**
- **Telewebion** - **Telewebion**
- **Tempo**
- **TennisTV**: [<abbr title="netrc machine"><em>tennistv</em></abbr>] - **TennisTV**: [<abbr title="netrc machine"><em>tennistv</em></abbr>]
- **TenPlay**: [<abbr title="netrc machine"><em>10play</em></abbr>] - **TenPlay**: [<abbr title="netrc machine"><em>10play</em></abbr>]
- **TF1** - **TF1**
@ -1287,10 +1334,10 @@ # Supported sites
- **ThreeSpeak** - **ThreeSpeak**
- **ThreeSpeakUser** - **ThreeSpeakUser**
- **TikTok** - **TikTok**
- **tiktok:effect** - **tiktok:effect**: (**Currently broken**)
- **tiktok:sound** - **tiktok:sound**: (**Currently broken**)
- **tiktok:tag** - **tiktok:tag**: (**Currently broken**)
- **tiktok:user** - **tiktok:user**: (**Currently broken**)
- **tinypic**: tinypic.com videos - **tinypic**: tinypic.com videos
- **TLC** - **TLC**
- **TMZ** - **TMZ**
@ -1306,6 +1353,8 @@ # Supported sites
- **ToypicsUser**: Toypics user profile - **ToypicsUser**: Toypics user profile
- **TrailerAddict**: (**Currently broken**) - **TrailerAddict**: (**Currently broken**)
- **TravelChannel** - **TravelChannel**
- **Triller**: [<abbr title="netrc machine"><em>triller</em></abbr>]
- **TrillerUser**: [<abbr title="netrc machine"><em>triller</em></abbr>]
- **Trilulilu** - **Trilulilu**
- **Trovo** - **Trovo**
- **TrovoChannelClip**: All Clips of a trovo.live channel; "trovoclip:" prefix - **TrovoChannelClip**: All Clips of a trovo.live channel; "trovoclip:" prefix
@ -1313,6 +1362,7 @@ # Supported sites
- **TrovoVod** - **TrovoVod**
- **TrueID** - **TrueID**
- **TruNews** - **TruNews**
- **Truth**
- **TruTV** - **TruTV**
- **Tube8** - **Tube8**
- **TubeTuGraz**: [<abbr title="netrc machine"><em>tubetugraz</em></abbr>] tube.tugraz.at - **TubeTuGraz**: [<abbr title="netrc machine"><em>tubetugraz</em></abbr>] tube.tugraz.at
@ -1328,6 +1378,7 @@ # Supported sites
- **Turbo** - **Turbo**
- **tv.dfb.de** - **tv.dfb.de**
- **TV2** - **TV2**
- **TV24UAGenericPassthrough**
- **TV2Article** - **TV2Article**
- **TV2DK** - **TV2DK**
- **TV2DKBornholmPlay** - **TV2DKBornholmPlay**
@ -1390,6 +1441,7 @@ # Supported sites
- **umg:de**: Universal Music Deutschland - **umg:de**: Universal Music Deutschland
- **Unistra** - **Unistra**
- **Unity** - **Unity**
- **UnscriptedNewsVideo**
- **uol.com.br** - **uol.com.br**
- **uplynk** - **uplynk**
- **uplynk:preplay** - **uplynk:preplay**
@ -1434,8 +1486,6 @@ # Supported sites
- **VidioLive**: [<abbr title="netrc machine"><em>vidio</em></abbr>] - **VidioLive**: [<abbr title="netrc machine"><em>vidio</em></abbr>]
- **VidioPremier**: [<abbr title="netrc machine"><em>vidio</em></abbr>] - **VidioPremier**: [<abbr title="netrc machine"><em>vidio</em></abbr>]
- **VidLii** - **VidLii**
- **vier**: [<abbr title="netrc machine"><em>vier</em></abbr>] vier.be and vijf.be
- **vier:videos**
- **viewlift** - **viewlift**
- **viewlift:embed** - **viewlift:embed**
- **Viidea** - **Viidea**
@ -1480,6 +1530,8 @@ # Supported sites
- **VoxMedia** - **VoxMedia**
- **VoxMediaVolume** - **VoxMediaVolume**
- **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **vqq:series**
- **vqq:video**
- **Vrak** - **Vrak**
- **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza - **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza
- **VrtNU**: [<abbr title="netrc machine"><em>vrtnu</em></abbr>] VrtNU.be - **VrtNU**: [<abbr title="netrc machine"><em>vrtnu</em></abbr>] VrtNU.be
@ -1488,6 +1540,8 @@ # Supported sites
- **VShare** - **VShare**
- **VTM** - **VTM**
- **VTXTV**: [<abbr title="netrc machine"><em>vtxtv</em></abbr>] - **VTXTV**: [<abbr title="netrc machine"><em>vtxtv</em></abbr>]
- **VTXTVLive**: [<abbr title="netrc machine"><em>vtxtv</em></abbr>]
- **VTXTVRecordings**: [<abbr title="netrc machine"><em>vtxtv</em></abbr>]
- **VuClip** - **VuClip**
- **Vupload** - **Vupload**
- **VVVVID** - **VVVVID**
@ -1497,6 +1551,8 @@ # Supported sites
- **Wakanim** - **Wakanim**
- **Walla** - **Walla**
- **WalyTV**: [<abbr title="netrc machine"><em>walytv</em></abbr>] - **WalyTV**: [<abbr title="netrc machine"><em>walytv</em></abbr>]
- **WalyTVLive**: [<abbr title="netrc machine"><em>walytv</em></abbr>]
- **WalyTVRecordings**: [<abbr title="netrc machine"><em>walytv</em></abbr>]
- **wasdtv:clip** - **wasdtv:clip**
- **wasdtv:record** - **wasdtv:record**
- **wasdtv:stream** - **wasdtv:stream**
@ -1525,8 +1581,10 @@ # Supported sites
- **Willow** - **Willow**
- **WimTV** - **WimTV**
- **Wistia** - **Wistia**
- **WistiaChannel**
- **WistiaPlaylist** - **WistiaPlaylist**
- **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **wordpress:playlist**
- **WorldStarHipHop** - **WorldStarHipHop**
- **wppilot** - **wppilot**
- **wppilot:channels** - **wppilot:channels**
@ -1583,13 +1641,14 @@ # Supported sites
- **youtube:clip** - **youtube:clip**
- **youtube:favorites**: YouTube liked videos; ":ytfav" keyword (requires cookies) - **youtube:favorites**: YouTube liked videos; ":ytfav" keyword (requires cookies)
- **youtube:history**: Youtube watch history; ":ythis" keyword (requires cookies) - **youtube:history**: Youtube watch history; ":ythis" keyword (requires cookies)
- **youtube:music:search_url**: YouTube music search URLs with selectable sections (Eg: #songs) - **youtube:music:search_url**: YouTube music search URLs with selectable sections, e.g. #songs
- **youtube:notif**: YouTube notifications; ":ytnotif" keyword (requires cookies) - **youtube:notif**: YouTube notifications; ":ytnotif" keyword (requires cookies)
- **youtube:playlist**: YouTube playlists - **youtube:playlist**: YouTube playlists
- **youtube:recommended**: YouTube recommended videos; ":ytrec" keyword - **youtube:recommended**: YouTube recommended videos; ":ytrec" keyword
- **youtube:search**: YouTube search; "ytsearch:" prefix - **youtube:search**: YouTube search; "ytsearch:" prefix
- **youtube:search:date**: YouTube search, newest videos first; "ytsearchdate:" prefix - **youtube:search:date**: YouTube search, newest videos first; "ytsearchdate:" prefix
- **youtube:search_url**: YouTube search URLs with sorting and filter support - **youtube:search_url**: YouTube search URLs with sorting and filter support
- **youtube:shorts:pivot:audio**: YouTube Shorts audio pivot (Shorts using audio of a given video)
- **youtube:stories**: YouTube channel stories; "ytstories:" prefix - **youtube:stories**: YouTube channel stories; "ytstories:" prefix
- **youtube:subscriptions**: YouTube subscriptions feed; ":ytsubs" keyword (requires cookies) - **youtube:subscriptions**: YouTube subscriptions feed; ":ytsubs" keyword (requires cookies)
- **youtube:tab**: YouTube Tabs - **youtube:tab**: YouTube Tabs


@ -1567,6 +1567,292 @@ def test_parse_ism_formats(self):
] ]
}, },
), ),
(
'ec-3_test',
'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
[{
'format_id': 'audio_deu_1-224',
'url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'manifest_url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'ext': 'isma',
'tbr': 224,
'asr': 48000,
'vcodec': 'none',
'acodec': 'EC-3',
'protocol': 'ism',
'_download_params':
{
'stream_type': 'audio',
'duration': 370000000,
'timescale': 10000000,
'width': 0,
'height': 0,
'fourcc': 'EC-3',
'language': 'deu',
'codec_private_data': '00063F000000AF87FBA7022DFB42A4D405CD93843BDD0700200F00',
'sampling_rate': 48000,
'channels': 6,
'bits_per_sample': 16,
'nal_unit_length_field': 4
},
'audio_ext': 'isma',
'video_ext': 'none',
'abr': 224,
}, {
'format_id': 'audio_deu-127',
'url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'manifest_url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'ext': 'isma',
'tbr': 127,
'asr': 48000,
'vcodec': 'none',
'acodec': 'AACL',
'protocol': 'ism',
'_download_params':
{
'stream_type': 'audio',
'duration': 370000000,
'timescale': 10000000,
'width': 0,
'height': 0,
'fourcc': 'AACL',
'language': 'deu',
'codec_private_data': '1190',
'sampling_rate': 48000,
'channels': 2,
'bits_per_sample': 16,
'nal_unit_length_field': 4
},
'audio_ext': 'isma',
'video_ext': 'none',
'abr': 127,
}, {
'format_id': 'video_deu-23',
'url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'manifest_url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'ext': 'ismv',
'width': 384,
'height': 216,
'tbr': 23,
'vcodec': 'AVC1',
'acodec': 'none',
'protocol': 'ism',
'_download_params':
{
'stream_type': 'video',
'duration': 370000000,
'timescale': 10000000,
'width': 384,
'height': 216,
'fourcc': 'AVC1',
'language': 'deu',
'codec_private_data': '000000016742C00CDB06077E5C05A808080A00000300020000030009C0C02EE0177CC6300F142AE00000000168CA8DC8',
'channels': 2,
'bits_per_sample': 16,
'nal_unit_length_field': 4
},
'video_ext': 'ismv',
'audio_ext': 'none',
'vbr': 23,
}, {
'format_id': 'video_deu-403',
'url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'manifest_url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'ext': 'ismv',
'width': 400,
'height': 224,
'tbr': 403,
'vcodec': 'AVC1',
'acodec': 'none',
'protocol': 'ism',
'_download_params':
{
'stream_type': 'video',
'duration': 370000000,
'timescale': 10000000,
'width': 400,
'height': 224,
'fourcc': 'AVC1',
'language': 'deu',
'codec_private_data': '00000001674D4014E98323B602D4040405000003000100000300320F1429380000000168EAECF2',
'channels': 2,
'bits_per_sample': 16,
'nal_unit_length_field': 4
},
'video_ext': 'ismv',
'audio_ext': 'none',
'vbr': 403,
}, {
'format_id': 'video_deu-680',
'url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'manifest_url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'ext': 'ismv',
'width': 640,
'height': 360,
'tbr': 680,
'vcodec': 'AVC1',
'acodec': 'none',
'protocol': 'ism',
'_download_params':
{
'stream_type': 'video',
'duration': 370000000,
'timescale': 10000000,
'width': 640,
'height': 360,
'fourcc': 'AVC1',
'language': 'deu',
'codec_private_data': '00000001674D401EE981405FF2E02D4040405000000300100000030320F162D3800000000168EAECF2',
'channels': 2,
'bits_per_sample': 16,
'nal_unit_length_field': 4
},
'video_ext': 'ismv',
'audio_ext': 'none',
'vbr': 680,
}, {
'format_id': 'video_deu-1253',
'url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'manifest_url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'ext': 'ismv',
'width': 640,
'height': 360,
'tbr': 1253,
'vcodec': 'AVC1',
'acodec': 'none',
'protocol': 'ism',
'_download_params':
{
'stream_type': 'video',
'duration': 370000000,
'timescale': 10000000,
'width': 640,
'height': 360,
'fourcc': 'AVC1',
'language': 'deu',
'codec_private_data': '00000001674D401EE981405FF2E02D4040405000000300100000030320F162D3800000000168EAECF2',
'channels': 2,
'bits_per_sample': 16,
'nal_unit_length_field': 4
},
'video_ext': 'ismv',
'audio_ext': 'none',
'vbr': 1253,
}, {
'format_id': 'video_deu-2121',
'url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'manifest_url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'ext': 'ismv',
'width': 768,
'height': 432,
'tbr': 2121,
'vcodec': 'AVC1',
'acodec': 'none',
'protocol': 'ism',
'_download_params':
{
'stream_type': 'video',
'duration': 370000000,
'timescale': 10000000,
'width': 768,
'height': 432,
'fourcc': 'AVC1',
'language': 'deu',
'codec_private_data': '00000001674D401EECA0601BD80B50101014000003000400000300C83C58B6580000000168E93B3C80',
'channels': 2,
'bits_per_sample': 16,
'nal_unit_length_field': 4
},
'video_ext': 'ismv',
'audio_ext': 'none',
'vbr': 2121,
}, {
'format_id': 'video_deu-3275',
'url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'manifest_url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'ext': 'ismv',
'width': 1280,
'height': 720,
'tbr': 3275,
'vcodec': 'AVC1',
'acodec': 'none',
'protocol': 'ism',
'_download_params':
{
'stream_type': 'video',
'duration': 370000000,
'timescale': 10000000,
'width': 1280,
'height': 720,
'fourcc': 'AVC1',
'language': 'deu',
'codec_private_data': '00000001674D4020ECA02802DD80B501010140000003004000000C83C60C65800000000168E93B3C80',
'channels': 2,
'bits_per_sample': 16,
'nal_unit_length_field': 4
},
'video_ext': 'ismv',
'audio_ext': 'none',
'vbr': 3275,
}, {
'format_id': 'video_deu-5300',
'url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'manifest_url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'ext': 'ismv',
'width': 1920,
'height': 1080,
'tbr': 5300,
'vcodec': 'AVC1',
'acodec': 'none',
'protocol': 'ism',
'_download_params':
{
'stream_type': 'video',
'duration': 370000000,
'timescale': 10000000,
'width': 1920,
'height': 1080,
'fourcc': 'AVC1',
'language': 'deu',
'codec_private_data': '00000001674D4028ECA03C0113F2E02D4040405000000300100000030320F18319600000000168E93B3C80',
'channels': 2,
'bits_per_sample': 16,
'nal_unit_length_field': 4
},
'video_ext': 'ismv',
'audio_ext': 'none',
'vbr': 5300,
}, {
'format_id': 'video_deu-8079',
'url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'manifest_url': 'https://smstr01.dmm.t-online.de/smooth24/smoothstream_m1/streaming/sony/9221438342941275747/636887760842957027/25_km_h-Trailer-9221571562372022953_deu_20_1300k_HD_H_264_ISMV.ism/Manifest',
'ext': 'ismv',
'width': 1920,
'height': 1080,
'tbr': 8079,
'vcodec': 'AVC1',
'acodec': 'none',
'protocol': 'ism',
'_download_params':
{
'stream_type': 'video',
'duration': 370000000,
'timescale': 10000000,
'width': 1920,
'height': 1080,
'fourcc': 'AVC1',
'language': 'deu',
'codec_private_data': '00000001674D4028ECA03C0113F2E02D4040405000000300100000030320F18319600000000168E93B3C80',
'channels': 2,
'bits_per_sample': 16,
'nal_unit_length_field': 4
},
'video_ext': 'ismv',
'audio_ext': 'none',
'vbr': 8079,
}],
{},
),
] ]
for ism_file, ism_url, expected_formats, expected_subtitles in _TEST_CASES: for ism_file, ism_url, expected_formats, expected_subtitles in _TEST_CASES:


@ -662,13 +662,17 @@ def test_add_extra_info(self):
'playlist_autonumber': 2, 'playlist_autonumber': 2,
'__last_playlist_index': 100, '__last_playlist_index': 100,
'n_entries': 10, 'n_entries': 10,
'formats': [{'id': 'id 1'}, {'id': 'id 2'}, {'id': 'id 3'}] 'formats': [
{'id': 'id 1', 'height': 1080, 'width': 1920},
{'id': 'id 2', 'height': 720},
{'id': 'id 3'}
]
} }
def test_prepare_outtmpl_and_filename(self): def test_prepare_outtmpl_and_filename(self):
def test(tmpl, expected, *, info=None, **params): def test(tmpl, expected, *, info=None, **params):
params['outtmpl'] = tmpl params['outtmpl'] = tmpl
ydl = YoutubeDL(params) ydl = FakeYDL(params)
ydl._num_downloads = 1 ydl._num_downloads = 1
self.assertEqual(ydl.validate_outtmpl(tmpl), None) self.assertEqual(ydl.validate_outtmpl(tmpl), None)
@ -729,6 +733,7 @@ def test(tmpl, expected, *, info=None, **params):
self.assertTrue(isinstance(YoutubeDL.validate_outtmpl('%(title)'), ValueError)) self.assertTrue(isinstance(YoutubeDL.validate_outtmpl('%(title)'), ValueError))
test('%(invalid@tmpl|def)s', 'none', outtmpl_na_placeholder='none') test('%(invalid@tmpl|def)s', 'none', outtmpl_na_placeholder='none')
test('%(..)s', 'NA') test('%(..)s', 'NA')
test('%(formats.{id)s', 'NA')
# Entire info_dict # Entire info_dict
def expect_same_infodict(out): def expect_same_infodict(out):
@ -813,6 +818,12 @@ def expect_same_infodict(out):
test('%(formats.:2:-1)r', repr(FORMATS[:2:-1])) test('%(formats.:2:-1)r', repr(FORMATS[:2:-1]))
test('%(formats.0.id.-1+id)f', '1235.000000') test('%(formats.0.id.-1+id)f', '1235.000000')
test('%(formats.0.id.-1+formats.1.id.-1)d', '3') test('%(formats.0.id.-1+formats.1.id.-1)d', '3')
out = json.dumps([{'id': f['id'], 'height.:2': str(f['height'])[:2]}
if 'height' in f else {'id': f['id']}
for f in FORMATS])
test('%(formats.:.{id,height.:2})j', (out, sanitize(out)))
test('%(formats.:.{id,height}.id)l', ', '.join(f['id'] for f in FORMATS))
test('%(.{id,title})j', ('{"id": "1234"}', '{id 1234}'))
# Alternates # Alternates
test('%(title,id)s', '1234') test('%(title,id)s', '1234')
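The expected value for the `%(formats.:.{id,height.:2})j` case above can be reproduced on its own. This is a minimal sketch, assuming the same three-entry `FORMATS` fixture shown in `test_add_extra_info`; the helper name `expected_projection` is hypothetical and only mirrors the list comprehension used in the test, not yt-dlp's template engine.

```python
import json

# Same shape as the FORMATS fixture in the test above.
FORMATS = [
    {'id': 'id 1', 'height': 1080, 'width': 1920},
    {'id': 'id 2', 'height': 720},
    {'id': 'id 3'},
]


def expected_projection(formats):
    """Keep 'id' always; add the first two characters of 'height' when present."""
    return json.dumps([
        {'id': f['id'], 'height.:2': str(f['height'])[:2]} if 'height' in f else {'id': f['id']}
        for f in formats
    ])


out = expected_projection(FORMATS)
print(out)
# [{"id": "id 1", "height.:2": "10"}, {"id": "id 2", "height.:2": "72"}, {"id": "id 3"}]
```

Entries without a `height` simply omit the key, which is why the third dict carries only `id`.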


@ -3,6 +3,7 @@
from yt_dlp import cookies from yt_dlp import cookies
from yt_dlp.cookies import ( from yt_dlp.cookies import (
LenientSimpleCookie,
LinuxChromeCookieDecryptor, LinuxChromeCookieDecryptor,
MacChromeCookieDecryptor, MacChromeCookieDecryptor,
WindowsChromeCookieDecryptor, WindowsChromeCookieDecryptor,
@ -137,3 +138,163 @@ def test_safari_cookie_parsing(self):
def test_pbkdf2_sha1(self): def test_pbkdf2_sha1(self):
key = pbkdf2_sha1(b'peanuts', b' ' * 16, 1, 16) key = pbkdf2_sha1(b'peanuts', b' ' * 16, 1, 16)
self.assertEqual(key, b'g\xe1\x8e\x0fQ\x1c\x9b\xf3\xc9`!\xaa\x90\xd9\xd34') self.assertEqual(key, b'g\xe1\x8e\x0fQ\x1c\x9b\xf3\xc9`!\xaa\x90\xd9\xd34')
class TestLenientSimpleCookie(unittest.TestCase):
def _run_tests(self, *cases):
for message, raw_cookie, expected in cases:
cookie = LenientSimpleCookie(raw_cookie)
with self.subTest(message, expected=expected):
self.assertEqual(cookie.keys(), expected.keys(), message)
for key, expected_value in expected.items():
morsel = cookie[key]
if isinstance(expected_value, tuple):
expected_value, expected_attributes = expected_value
else:
expected_attributes = {}
attributes = {
key: value
for key, value in dict(morsel).items()
if value != ""
}
self.assertEqual(attributes, expected_attributes, message)
self.assertEqual(morsel.value, expected_value, message)
def test_parsing(self):
self._run_tests(
# Copied from https://github.com/python/cpython/blob/v3.10.7/Lib/test/test_http_cookies.py
(
"Test basic cookie",
"chips=ahoy; vienna=finger",
{"chips": "ahoy", "vienna": "finger"},
),
(
"Test quoted cookie",
'keebler="E=mc2; L=\\"Loves\\"; fudge=\\012;"',
{"keebler": 'E=mc2; L="Loves"; fudge=\012;'},
),
(
"Allow '=' in an unquoted value",
"keebler=E=mc2",
{"keebler": "E=mc2"},
),
(
"Allow cookies with ':' in their name",
"key:term=value:term",
{"key:term": "value:term"},
),
(
"Allow '[' and ']' in cookie values",
"a=b; c=[; d=r; f=h",
{"a": "b", "c": "[", "d": "r", "f": "h"},
),
(
"Test basic cookie attributes",
'Customer="WILE_E_COYOTE"; Version=1; Path=/acme',
{"Customer": ("WILE_E_COYOTE", {"version": "1", "path": "/acme"})},
),
(
"Test flag only cookie attributes",
'Customer="WILE_E_COYOTE"; HttpOnly; Secure',
{"Customer": ("WILE_E_COYOTE", {"httponly": True, "secure": True})},
),
(
"Test flag only attribute with values",
"eggs=scrambled; httponly=foo; secure=bar; Path=/bacon",
{"eggs": ("scrambled", {"httponly": "foo", "secure": "bar", "path": "/bacon"})},
),
(
"Test special case for 'expires' attribute, 4 digit year",
'Customer="W"; expires=Wed, 01 Jan 2010 00:00:00 GMT',
{"Customer": ("W", {"expires": "Wed, 01 Jan 2010 00:00:00 GMT"})},
),
(
"Test special case for 'expires' attribute, 2 digit year",
'Customer="W"; expires=Wed, 01 Jan 98 00:00:00 GMT',
{"Customer": ("W", {"expires": "Wed, 01 Jan 98 00:00:00 GMT"})},
),
(
"Test extra spaces in keys and values",
"eggs = scrambled ; secure ; path = bar ; foo=foo ",
{"eggs": ("scrambled", {"secure": True, "path": "bar"}), "foo": "foo"},
),
(
"Test quoted attributes",
'Customer="WILE_E_COYOTE"; Version="1"; Path="/acme"',
{"Customer": ("WILE_E_COYOTE", {"version": "1", "path": "/acme"})}
),
# Our own tests that CPython passes
(
"Allow ';' in quoted value",
'chips="a;hoy"; vienna=finger',
{"chips": "a;hoy", "vienna": "finger"},
),
(
"Keep only the last set value",
"a=c; a=b",
{"a": "b"},
),
)
def test_lenient_parsing(self):
self._run_tests(
(
"Ignore and try to skip invalid cookies",
'chips={"ahoy;": 1}; vienna="finger;"',
{"vienna": "finger;"},
),
(
"Ignore cookies without a name",
"a=b; unnamed; c=d",
{"a": "b", "c": "d"},
),
(
"Ignore '\"' cookie without name",
'a=b; "; c=d',
{"a": "b", "c": "d"},
),
(
"Skip all space separated values",
"x a=b c=d x; e=f",
{"a": "b", "c": "d", "e": "f"},
),
(
"Skip all space separated values",
'x a=b; data={"complex": "json", "with": "key=value"}; x c=d x',
{"a": "b", "c": "d"},
),
(
"Expect quote mending",
'a=b; invalid="; c=d',
{"a": "b", "c": "d"},
),
(
"Reset morsel after invalid to not capture attributes",
"a=b; invalid; Version=1; c=d",
{"a": "b", "c": "d"},
),
(
"Reset morsel after invalid to not capture attributes",
"a=b; $invalid; $Version=1; c=d",
{"a": "b", "c": "d"},
),
(
"Continue after non-flag attribute without value",
"a=b; path; Version=1; c=d",
{"a": "b", "c": "d"},
),
(
"Allow cookie attributes with `$` prefix",
'Customer="WILE_E_COYOTE"; $Version=1; $Secure; $Path=/acme',
{"Customer": ("WILE_E_COYOTE", {"version": "1", "secure": True, "path": "/acme"})},
),
(
"Invalid Morsel keys should not result in an error",
"Key=Value; [Invalid]=Value; Another=Value",
{"Key": "Value", "Another": "Value"},
),
)
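The attribute filtering done in `_run_tests` can be tried against the standard library's `http.cookies.SimpleCookie`, which `LenientSimpleCookie` is tested against here. This is only a sketch using the stdlib class with one of the well-formed inputs copied from CPython's test suite; the helper name `non_empty_attributes` is hypothetical, and the lenient cases above are not expected to behave the same way with the stdlib parser.

```python
from http.cookies import SimpleCookie


def non_empty_attributes(morsel):
    """Return only the attributes that were actually set on a Morsel."""
    # A Morsel is a dict whose reserved keys default to the empty string.
    return {key: value for key, value in dict(morsel).items() if value != ''}


cookie = SimpleCookie('Customer="WILE_E_COYOTE"; Version=1; Path=/acme')
morsel = cookie['Customer']
print(morsel.value)                  # WILE_E_COYOTE
print(non_empty_attributes(morsel))
```

Dropping the empty defaults is what lets each test case list only the attributes it cares about.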


@ -105,11 +105,11 @@ def print_skipping(reason):
info_dict = tc.get('info_dict', {}) info_dict = tc.get('info_dict', {})
params = tc.get('params', {}) params = tc.get('params', {})
if not info_dict.get('id'): if not info_dict.get('id'):
raise Exception('Test definition incorrect. \'id\' key is not present') raise Exception(f'Test {tname} definition incorrect - "id" key is not present')
elif not info_dict.get('ext'): elif not info_dict.get('ext'):
if params.get('skip_download') and params.get('ignore_no_formats_error'): if params.get('skip_download') and params.get('ignore_no_formats_error'):
continue continue
raise Exception('Test definition incorrect. The output file cannot be known. \'ext\' key is not present') raise Exception(f'Test {tname} definition incorrect - "ext" key must be present to define the output file')
if 'skip' in test_case: if 'skip' in test_case:
print_skipping(test_case['skip']) print_skipping(test_case['skip'])
@ -161,7 +161,9 @@ def try_rm_tcs_files(tcs=None):
force_generic_extractor=params.get('force_generic_extractor', False)) force_generic_extractor=params.get('force_generic_extractor', False))
except (DownloadError, ExtractorError) as err: except (DownloadError, ExtractorError) as err:
# Check if the exception is not a network related one # Check if the exception is not a network related one
if not err.exc_info[0] in (urllib.error.URLError, socket.timeout, UnavailableVideoError, http.client.BadStatusLine) or (err.exc_info[0] == urllib.error.HTTPError and err.exc_info[1].code == 503): if (err.exc_info[0] not in (urllib.error.URLError, socket.timeout, UnavailableVideoError, http.client.BadStatusLine)
or (err.exc_info[0] == urllib.error.HTTPError and err.exc_info[1].code == 503)):
err.msg = f'{getattr(err, "msg", err)} ({tname})'
raise raise
if try_num == RETRIES: if try_num == RETRIES:


@ -11,41 +11,46 @@
import contextlib import contextlib
import subprocess import subprocess
from yt_dlp.utils import encodeArgument from yt_dlp.utils import Popen
rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
LAZY_EXTRACTORS = 'yt_dlp/extractor/lazy_extractors.py'
try:
_DEV_NULL = subprocess.DEVNULL
except AttributeError:
_DEV_NULL = open(os.devnull, 'wb')
class TestExecution(unittest.TestCase): class TestExecution(unittest.TestCase):
def test_import(self): def run_yt_dlp(self, exe=(sys.executable, 'yt_dlp/__main__.py'), opts=('--version', )):
subprocess.check_call([sys.executable, '-c', 'import yt_dlp'], cwd=rootDir) stdout, stderr, returncode = Popen.run(
[*exe, '--ignore-config', *opts], cwd=rootDir, text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
def test_module_exec(self): print(stderr, file=sys.stderr)
subprocess.check_call([sys.executable, '-m', 'yt_dlp', '--ignore-config', '--version'], cwd=rootDir, stdout=_DEV_NULL) self.assertEqual(returncode, 0)
return stdout.strip(), stderr.strip()
def test_main_exec(self): def test_main_exec(self):
subprocess.check_call([sys.executable, 'yt_dlp/__main__.py', '--ignore-config', '--version'], cwd=rootDir, stdout=_DEV_NULL) self.run_yt_dlp()
def test_import(self):
self.run_yt_dlp(exe=(sys.executable, '-c', 'import yt_dlp'))
def test_module_exec(self):
self.run_yt_dlp(exe=(sys.executable, '-m', 'yt_dlp'))
def test_cmdline_umlauts(self): def test_cmdline_umlauts(self):
p = subprocess.Popen( _, stderr = self.run_yt_dlp(opts=('ä', '--version'))
[sys.executable, 'yt_dlp/__main__.py', '--ignore-config', encodeArgument('ä'), '--version'],
cwd=rootDir, stdout=_DEV_NULL, stderr=subprocess.PIPE)
_, stderr = p.communicate()
self.assertFalse(stderr) self.assertFalse(stderr)
def test_lazy_extractors(self): def test_lazy_extractors(self):
try: try:
subprocess.check_call([sys.executable, 'devscripts/make_lazy_extractors.py', 'yt_dlp/extractor/lazy_extractors.py'], cwd=rootDir, stdout=_DEV_NULL) subprocess.check_call([sys.executable, 'devscripts/make_lazy_extractors.py', LAZY_EXTRACTORS],
subprocess.check_call([sys.executable, 'test/test_all_urls.py'], cwd=rootDir, stdout=_DEV_NULL) cwd=rootDir, stdout=subprocess.DEVNULL)
self.assertTrue(os.path.exists(LAZY_EXTRACTORS))
_, stderr = self.run_yt_dlp(opts=('-s', 'test:'))
self.assertFalse(stderr)
subprocess.check_call([sys.executable, 'test/test_all_urls.py'], cwd=rootDir, stdout=subprocess.DEVNULL)
finally: finally:
with contextlib.suppress(OSError): with contextlib.suppress(OSError):
os.remove('yt_dlp/extractor/lazy_extractors.py') os.remove(LAZY_EXTRACTORS)
if __name__ == '__main__': if __name__ == '__main__':
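The new `run_yt_dlp` helper captures stdout and stderr and checks the exit code in one place. A rough stdlib-only equivalent is sketched below, assuming `subprocess.run` as a stand-in for yt-dlp's `Popen.run` wrapper; the function name `run_python` is hypothetical and it runs a trivial inline script rather than yt-dlp itself.

```python
import subprocess
import sys


def run_python(*args):
    """Run the current interpreter with args; return (stdout, stderr, returncode)."""
    proc = subprocess.run(
        [sys.executable, *args], text=True,
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.stdout.strip(), proc.stderr.strip(), proc.returncode


stdout, stderr, returncode = run_python('-c', 'print("ok")')
print(returncode, stdout)  # 0 ok
```

Returning both streams lets individual tests assert on stderr, as `test_cmdline_umlauts` does, without repeating the process setup.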


@ -7,8 +7,10 @@
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import math
import re
from yt_dlp.jsinterp import JSInterpreter from yt_dlp.jsinterp import JS_Undefined, JSInterpreter
class TestJSInterpreter(unittest.TestCase): class TestJSInterpreter(unittest.TestCase):
@ -19,6 +21,9 @@ def test_basic(self):
jsi = JSInterpreter('function x3(){return 42;}') jsi = JSInterpreter('function x3(){return 42;}')
self.assertEqual(jsi.call_function('x3'), 42) self.assertEqual(jsi.call_function('x3'), 42)
jsi = JSInterpreter('function x3(){42}')
self.assertEqual(jsi.call_function('x3'), None)
jsi = JSInterpreter('var x5 = function(){return 42;}') jsi = JSInterpreter('var x5 = function(){return 42;}')
self.assertEqual(jsi.call_function('x5'), 42) self.assertEqual(jsi.call_function('x5'), 42)
@ -45,14 +50,32 @@ def test_operators(self):
jsi = JSInterpreter('function f(){return 1 << 5;}') jsi = JSInterpreter('function f(){return 1 << 5;}')
self.assertEqual(jsi.call_function('f'), 32) self.assertEqual(jsi.call_function('f'), 32)
jsi = JSInterpreter('function f(){return 2 ** 5}')
self.assertEqual(jsi.call_function('f'), 32)
jsi = JSInterpreter('function f(){return 19 & 21;}') jsi = JSInterpreter('function f(){return 19 & 21;}')
self.assertEqual(jsi.call_function('f'), 17) self.assertEqual(jsi.call_function('f'), 17)
jsi = JSInterpreter('function f(){return 11 >> 2;}') jsi = JSInterpreter('function f(){return 11 >> 2;}')
self.assertEqual(jsi.call_function('f'), 2) self.assertEqual(jsi.call_function('f'), 2)
jsi = JSInterpreter('function f(){return []? 2+3: 4;}')
self.assertEqual(jsi.call_function('f'), 5)
jsi = JSInterpreter('function f(){return 1 == 2}')
self.assertEqual(jsi.call_function('f'), False)
jsi = JSInterpreter('function f(){return 0 && 1 || 2;}')
self.assertEqual(jsi.call_function('f'), 2)
jsi = JSInterpreter('function f(){return 0 ?? 42;}')
self.assertEqual(jsi.call_function('f'), 0)
jsi = JSInterpreter('function f(){return "life, the universe and everything" < 42;}')
self.assertFalse(jsi.call_function('f'))
def test_array_access(self): def test_array_access(self):
jsi = JSInterpreter('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2] = 7; return x;}') jsi = JSInterpreter('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2.0] = 7; return x;}')
self.assertEqual(jsi.call_function('f'), [5, 2, 7]) self.assertEqual(jsi.call_function('f'), [5, 2, 7])
def test_parens(self): def test_parens(self):
@ -62,6 +85,10 @@ def test_parens(self):
jsi = JSInterpreter('function f(){return (1 + 2) * 3;}') jsi = JSInterpreter('function f(){return (1 + 2) * 3;}')
self.assertEqual(jsi.call_function('f'), 9) self.assertEqual(jsi.call_function('f'), 9)
def test_quotes(self):
jsi = JSInterpreter(R'function f(){return "a\"\\("}')
self.assertEqual(jsi.call_function('f'), R'a"\(')
def test_assignments(self): def test_assignments(self):
jsi = JSInterpreter('function f(){var x = 20; x = 30 + 1; return x;}') jsi = JSInterpreter('function f(){var x = 20; x = 30 + 1; return x;}')
self.assertEqual(jsi.call_function('f'), 31) self.assertEqual(jsi.call_function('f'), 31)
@ -104,17 +131,33 @@ def test_precedence(self):
}''') }''')
self.assertEqual(jsi.call_function('x'), [20, 20, 30, 40, 50]) self.assertEqual(jsi.call_function('x'), [20, 20, 30, 40, 50])
def test_builtins(self):
jsi = JSInterpreter('''
function x() { return NaN }
''')
self.assertTrue(math.isnan(jsi.call_function('x')))
jsi = JSInterpreter('''
function x() { return new Date('Wednesday 31 December 1969 18:01:26 MDT') - 0; }
''')
self.assertEqual(jsi.call_function('x'), 86000)
jsi = JSInterpreter('''
function x(dt) { return new Date(dt) - 0; }
''')
self.assertEqual(jsi.call_function('x', 'Wednesday 31 December 1969 18:01:26 MDT'), 86000)
def test_call(self): def test_call(self):
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { return 2; } function x() { return 2; }
function y(a) { return x() + a; } function y(a) { return x() + (a?a:0); }
function z() { return y(3); } function z() { return y(3); }
''') ''')
self.assertEqual(jsi.call_function('z'), 5) self.assertEqual(jsi.call_function('z'), 5)
self.assertEqual(jsi.call_function('y'), 2)
def test_for_loop(self): def test_for_loop(self):
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { a=0; for (i=0; i-10; i++) {a++} a } function x() { a=0; for (i=0; i-10; i++) {a++} return a }
''') ''')
self.assertEqual(jsi.call_function('x'), 10) self.assertEqual(jsi.call_function('x'), 10)
@ -153,21 +196,53 @@ def test_try(self):
''') ''')
self.assertEqual(jsi.call_function('x'), 10) self.assertEqual(jsi.call_function('x'), 10)
def test_catch(self):
jsi = JSInterpreter('''
function x() { try{throw 10} catch(e){return 5} }
''')
self.assertEqual(jsi.call_function('x'), 5)
def test_finally(self):
jsi = JSInterpreter('''
function x() { try{throw 10} finally {return 42} }
''')
self.assertEqual(jsi.call_function('x'), 42)
jsi = JSInterpreter('''
function x() { try{throw 10} catch(e){return 5} finally {return 42} }
''')
self.assertEqual(jsi.call_function('x'), 42)
def test_nested_try(self):
jsi = JSInterpreter('''
function x() {try {
try{throw 10} finally {throw 42}
} catch(e){return 5} }
''')
self.assertEqual(jsi.call_function('x'), 5)
def test_for_loop_continue(self): def test_for_loop_continue(self):
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { a=0; for (i=0; i-10; i++) { continue; a++ } a } function x() { a=0; for (i=0; i-10; i++) { continue; a++ } return a }
''') ''')
self.assertEqual(jsi.call_function('x'), 0) self.assertEqual(jsi.call_function('x'), 0)
def test_for_loop_break(self): def test_for_loop_break(self):
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { a=0; for (i=0; i-10; i++) { break; a++ } a } function x() { a=0; for (i=0; i-10; i++) { break; a++ } return a }
''') ''')
self.assertEqual(jsi.call_function('x'), 0) self.assertEqual(jsi.call_function('x'), 0)
def test_for_loop_try(self):
jsi = JSInterpreter('''
function x() {
for (i=0; i-10; i++) { try { if (i == 5) throw i} catch {return 10} finally {break} };
return 42 }
''')
self.assertEqual(jsi.call_function('x'), 42)
def test_literal_list(self): def test_literal_list(self):
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { [1, 2, "asdf", [5, 6, 7]][3] } function x() { return [1, 2, "asdf", [5, 6, 7]][3] }
''') ''')
self.assertEqual(jsi.call_function('x'), [5, 6, 7]) self.assertEqual(jsi.call_function('x'), [5, 6, 7])
@ -177,6 +252,167 @@ def test_comma(self):
''') ''')
self.assertEqual(jsi.call_function('x'), 7) self.assertEqual(jsi.call_function('x'), 7)
jsi = JSInterpreter('''
function x() { a=5; return (a -= 1, a+=3, a); }
''')
self.assertEqual(jsi.call_function('x'), 7)
jsi = JSInterpreter('''
function x() { return (l=[0,1,2,3], function(a, b){return a+b})((l[1], l[2]), l[3]) }
''')
self.assertEqual(jsi.call_function('x'), 5)
def test_void(self):
jsi = JSInterpreter('''
function x() { return void 42; }
''')
self.assertEqual(jsi.call_function('x'), None)
def test_return_function(self):
jsi = JSInterpreter('''
function x() { return [1, function(){return 1}][1] }
''')
self.assertEqual(jsi.call_function('x')([]), 1)
def test_null(self):
jsi = JSInterpreter('''
function x() { return null; }
''')
self.assertEqual(jsi.call_function('x'), None)
jsi = JSInterpreter('''
function x() { return [null > 0, null < 0, null == 0, null === 0]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False, False, False])
jsi = JSInterpreter('''
function x() { return [null >= 0, null <= 0]; }
''')
self.assertEqual(jsi.call_function('x'), [True, True])
def test_undefined(self):
jsi = JSInterpreter('''
function x() { return undefined === undefined; }
''')
self.assertEqual(jsi.call_function('x'), True)
jsi = JSInterpreter('''
function x() { return undefined; }
''')
self.assertEqual(jsi.call_function('x'), JS_Undefined)
jsi = JSInterpreter('''
function x() { let v; return v; }
''')
self.assertEqual(jsi.call_function('x'), JS_Undefined)
jsi = JSInterpreter('''
function x() { return [undefined === undefined, undefined == undefined, undefined < undefined, undefined > undefined]; }
''')
self.assertEqual(jsi.call_function('x'), [True, True, False, False])
jsi = JSInterpreter('''
function x() { return [undefined === 0, undefined == 0, undefined < 0, undefined > 0]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False, False, False])
jsi = JSInterpreter('''
function x() { return [undefined >= 0, undefined <= 0]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False])
jsi = JSInterpreter('''
function x() { return [undefined > null, undefined < null, undefined == null, undefined === null]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False, True, False])
jsi = JSInterpreter('''
function x() { return [undefined === null, undefined == null, undefined < null, undefined > null]; }
''')
self.assertEqual(jsi.call_function('x'), [False, True, False, False])
jsi = JSInterpreter('''
function x() { let v; return [42+v, v+42, v**42, 42**v, 0**v]; }
''')
for y in jsi.call_function('x'):
self.assertTrue(math.isnan(y))
jsi = JSInterpreter('''
function x() { let v; return v**0; }
''')
self.assertEqual(jsi.call_function('x'), 1)
jsi = JSInterpreter('''
function x() { let v; return [v>42, v<=42, v&&42, 42&&v]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False, JS_Undefined, JS_Undefined])
jsi = JSInterpreter('function x(){return undefined ?? 42; }')
self.assertEqual(jsi.call_function('x'), 42)
def test_object(self):
jsi = JSInterpreter('''
function x() { return {}; }
''')
self.assertEqual(jsi.call_function('x'), {})
jsi = JSInterpreter('''
function x() { let a = {m1: 42, m2: 0 }; return [a["m1"], a.m2]; }
''')
self.assertEqual(jsi.call_function('x'), [42, 0])
jsi = JSInterpreter('''
function x() { let a; return a?.qq; }
''')
self.assertEqual(jsi.call_function('x'), JS_Undefined)
jsi = JSInterpreter('''
function x() { let a = {m1: 42, m2: 0 }; return a?.qq; }
''')
self.assertEqual(jsi.call_function('x'), JS_Undefined)
def test_regex(self):
jsi = JSInterpreter('''
function x() { let a=/,,[/,913,/](,)}/; }
''')
self.assertEqual(jsi.call_function('x'), None)
jsi = JSInterpreter('''
function x() { let a=/,,[/,913,/](,)}/; return a; }
''')
self.assertIsInstance(jsi.call_function('x'), re.Pattern)
jsi = JSInterpreter('''
function x() { let a=/,,[/,913,/](,)}/i; return a; }
''')
self.assertEqual(jsi.call_function('x').flags & re.I, re.I)
jsi = JSInterpreter(R'''
function x() { let a=/,][}",],()}(\[)/; return a; }
''')
self.assertEqual(jsi.call_function('x').pattern, r',][}",],()}(\[)')
jsi = JSInterpreter(R'''
function x() { let a=[/[)\\]/]; return a[0]; }
''')
self.assertEqual(jsi.call_function('x').pattern, r'[)\\]')
def test_char_code_at(self):
jsi = JSInterpreter('function x(i){return "test".charCodeAt(i)}')
self.assertEqual(jsi.call_function('x', 0), 116)
self.assertEqual(jsi.call_function('x', 1), 101)
self.assertEqual(jsi.call_function('x', 2), 115)
self.assertEqual(jsi.call_function('x', 3), 116)
self.assertEqual(jsi.call_function('x', 4), None)
self.assertEqual(jsi.call_function('x', 'not_a_number'), 116)
def test_bitwise_operators_overflow(self):
jsi = JSInterpreter('function x(){return -524999584 << 5}')
self.assertEqual(jsi.call_function('x'), 379882496)
jsi = JSInterpreter('function x(){return 1236566549 << 5}')
self.assertEqual(jsi.call_function('x'), 915423904)
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()
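The two values expected by `test_bitwise_operators_overflow` follow from JavaScript's `<<` working on 32-bit signed integers. The sketch below reproduces that arithmetic in plain Python; `js_shift_left` is a hypothetical helper written for illustration and is not claimed to be how `JSInterpreter` implements the operator.

```python
def js_shift_left(value, shift):
    """Emulate JavaScript's `<<` on 32-bit signed integers."""
    # Only the low 5 bits of the shift count are used; keep the low 32 bits.
    result = (value << (shift & 31)) & 0xFFFFFFFF
    # Reinterpret the top bit as the sign bit.
    return result - 0x100000000 if result & 0x80000000 else result


print(js_shift_left(-524999584, 5))  # 379882496
print(js_shift_left(1236566549, 5))  # 915423904
print(js_shift_left(1, 5))           # 32
```

Python integers are unbounded, so without the mask and sign adjustment the first call would yield -16799986688 instead of the wrapped value.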


@ -16,6 +16,7 @@
MetadataFromFieldPP, MetadataFromFieldPP,
MetadataParserPP, MetadataParserPP,
ModifyChaptersPP, ModifyChaptersPP,
SponsorBlockPP,
) )
@ -76,11 +77,15 @@ def setUp(self):
self._pp = ModifyChaptersPP(YoutubeDL()) self._pp = ModifyChaptersPP(YoutubeDL())
@staticmethod @staticmethod
def _sponsor_chapter(start, end, cat, remove=False): def _sponsor_chapter(start, end, cat, remove=False, title=None):
c = {'start_time': start, 'end_time': end, '_categories': [(cat, start, end)]} if title is None:
if remove: title = SponsorBlockPP.CATEGORIES[cat]
c['remove'] = True return {
return c 'start_time': start,
'end_time': end,
'_categories': [(cat, start, end, title)],
**({'remove': True} if remove else {}),
}
@staticmethod @staticmethod
def _chapter(start, end, title=None, remove=False): def _chapter(start, end, title=None, remove=False):
@ -130,6 +135,19 @@ def test_remove_marked_arrange_sponsors_ChapterWithSponsors(self):
'c', '[SponsorBlock]: Filler Tangent', 'c']) 'c', '[SponsorBlock]: Filler Tangent', 'c'])
self._remove_marked_arrange_sponsors_test_impl(chapters, expected, []) self._remove_marked_arrange_sponsors_test_impl(chapters, expected, [])
def test_remove_marked_arrange_sponsors_SponsorBlockChapters(self):
chapters = self._chapters([70], ['c']) + [
self._sponsor_chapter(10, 20, 'chapter', title='sb c1'),
self._sponsor_chapter(15, 16, 'chapter', title='sb c2'),
self._sponsor_chapter(30, 40, 'preview'),
self._sponsor_chapter(50, 60, 'filler')]
expected = self._chapters(
[10, 15, 16, 20, 30, 40, 50, 60, 70],
['c', '[SponsorBlock]: sb c1', '[SponsorBlock]: sb c1, sb c2', '[SponsorBlock]: sb c1',
'c', '[SponsorBlock]: Preview/Recap',
'c', '[SponsorBlock]: Filler Tangent', 'c'])
self._remove_marked_arrange_sponsors_test_impl(chapters, expected, [])
def test_remove_marked_arrange_sponsors_UniqueNamesForOverlappingSponsors(self): def test_remove_marked_arrange_sponsors_UniqueNamesForOverlappingSponsors(self):
chapters = self._chapters([120], ['c']) + [ chapters = self._chapters([120], ['c']) + [
self._sponsor_chapter(10, 45, 'sponsor'), self._sponsor_chapter(20, 40, 'selfpromo'), self._sponsor_chapter(10, 45, 'sponsor'), self._sponsor_chapter(20, 40, 'selfpromo'),
@ -173,7 +191,7 @@ def test_remove_marked_arrange_sponsors_ChapterWithSponsorCutInTheMiddle(self):
self._remove_marked_arrange_sponsors_test_impl(chapters, expected, cuts) self._remove_marked_arrange_sponsors_test_impl(chapters, expected, cuts)
def test_remove_marked_arrange_sponsors_ChapterWithCutHidingSponsor(self): def test_remove_marked_arrange_sponsors_ChapterWithCutHidingSponsor(self):
cuts = [self._sponsor_chapter(20, 50, 'selpromo', remove=True)] cuts = [self._sponsor_chapter(20, 50, 'selfpromo', remove=True)]
chapters = self._chapters([60], ['c']) + [ chapters = self._chapters([60], ['c']) + [
self._sponsor_chapter(10, 20, 'intro'), self._sponsor_chapter(10, 20, 'intro'),
self._sponsor_chapter(30, 40, 'sponsor'), self._sponsor_chapter(30, 40, 'sponsor'),
@ -199,7 +217,7 @@ def test_remove_marked_arrange_sponsors_ChapterWithAdjacentCuts(self):
self._sponsor_chapter(10, 20, 'sponsor'), self._sponsor_chapter(10, 20, 'sponsor'),
self._sponsor_chapter(20, 30, 'interaction', remove=True), self._sponsor_chapter(20, 30, 'interaction', remove=True),
self._chapter(30, 40, remove=True), self._chapter(30, 40, remove=True),
self._sponsor_chapter(40, 50, 'selpromo', remove=True), self._sponsor_chapter(40, 50, 'selfpromo', remove=True),
self._sponsor_chapter(50, 60, 'interaction')] self._sponsor_chapter(50, 60, 'interaction')]
expected = self._chapters([10, 20, 30, 40], expected = self._chapters([10, 20, 30, 40],
['c', '[SponsorBlock]: Sponsor', ['c', '[SponsorBlock]: Sponsor',
@ -282,7 +300,7 @@ def test_remove_marked_arrange_sponsors_SponsorsNoLongerOverlapAfterCut(self):
chapters = self._chapters([70], ['c']) + [ chapters = self._chapters([70], ['c']) + [
self._sponsor_chapter(10, 30, 'sponsor'), self._sponsor_chapter(10, 30, 'sponsor'),
self._sponsor_chapter(20, 50, 'interaction'), self._sponsor_chapter(20, 50, 'interaction'),
self._sponsor_chapter(30, 50, 'selpromo', remove=True), self._sponsor_chapter(30, 50, 'selfpromo', remove=True),
self._sponsor_chapter(40, 60, 'sponsor'), self._sponsor_chapter(40, 60, 'sponsor'),
self._sponsor_chapter(50, 60, 'interaction')] self._sponsor_chapter(50, 60, 'interaction')]
expected = self._chapters( expected = self._chapters(


@ -2,6 +2,7 @@
# Allow direct execution # Allow direct execution
import os import os
import re
import sys import sys
import unittest import unittest
@ -109,6 +110,7 @@
strip_or_none, strip_or_none,
subtitles_filename, subtitles_filename,
timeconvert, timeconvert,
traverse_obj,
unescapeHTML, unescapeHTML,
unified_strdate, unified_strdate,
unified_timestamp, unified_timestamp,
@ -413,6 +415,10 @@ def test_unified_timestamps(self):
self.assertEqual(unified_timestamp('December 15, 2017 at 7:49 am'), 1513324140) self.assertEqual(unified_timestamp('December 15, 2017 at 7:49 am'), 1513324140)
self.assertEqual(unified_timestamp('2018-03-14T08:32:43.1493874+00:00'), 1521016363) self.assertEqual(unified_timestamp('2018-03-14T08:32:43.1493874+00:00'), 1521016363)
self.assertEqual(unified_timestamp('December 31 1969 20:00:01 EDT'), 1)
self.assertEqual(unified_timestamp('Wednesday 31 December 1969 18:01:26 MDT'), 86)
self.assertEqual(unified_timestamp('12/31/1969 20:01:18 EDT', False), 78)
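The expected timestamps in the three new assertions can be checked by hand with the standard library. This is a sketch that assumes EDT is UTC-4 and MDT is UTC-6 and builds the datetimes directly; it does not parse the strings the way `unified_timestamp` does.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zones for the abbreviations used in the tests (assumed offsets).
EDT = timezone(timedelta(hours=-4), 'EDT')
MDT = timezone(timedelta(hours=-6), 'MDT')

edt_seconds = int(datetime(1969, 12, 31, 20, 0, 1, tzinfo=EDT).timestamp())
mdt_seconds = int(datetime(1969, 12, 31, 18, 1, 26, tzinfo=MDT).timestamp())
edt_late_seconds = int(datetime(1969, 12, 31, 20, 1, 18, tzinfo=EDT).timestamp())

print(edt_seconds, mdt_seconds, edt_late_seconds)  # 1 86 78
```

The JavaScript `Date` tests in the interpreter suite expect 86000 for the MDT value because `Date` arithmetic is in milliseconds rather than seconds.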
def test_determine_ext(self): def test_determine_ext(self):
self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4') self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4')
self.assertEqual(determine_ext('http://example.com/foo/bar/?download', None), None) self.assertEqual(determine_ext('http://example.com/foo/bar/?download', None), None)
@ -562,6 +568,7 @@ def test_base_url(self):
self.assertEqual(base_url('http://foo.de/bar/'), 'http://foo.de/bar/') self.assertEqual(base_url('http://foo.de/bar/'), 'http://foo.de/bar/')
self.assertEqual(base_url('http://foo.de/bar/baz'), 'http://foo.de/bar/') self.assertEqual(base_url('http://foo.de/bar/baz'), 'http://foo.de/bar/')
self.assertEqual(base_url('http://foo.de/bar/baz?x=z/x/c'), 'http://foo.de/bar/') self.assertEqual(base_url('http://foo.de/bar/baz?x=z/x/c'), 'http://foo.de/bar/')
self.assertEqual(base_url('http://foo.de/bar/baz&x=z&w=y/x/c'), 'http://foo.de/bar/baz&x=z&w=y/x/')
def test_urljoin(self): def test_urljoin(self):
self.assertEqual(urljoin('http://foo.de/', '/a/b/c.txt'), 'http://foo.de/a/b/c.txt') self.assertEqual(urljoin('http://foo.de/', '/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
@ -1093,6 +1100,12 @@ def test_js_to_json_edgecases(self):
on = js_to_json('[1,//{},\n2]') on = js_to_json('[1,//{},\n2]')
self.assertEqual(json.loads(on), [1, 2]) self.assertEqual(json.loads(on), [1, 2])
on = js_to_json(R'"\^\$\#"')
self.assertEqual(json.loads(on), R'^$#', msg='Unnecessary escapes should be stripped')
on = js_to_json('\'"\\""\'')
self.assertEqual(json.loads(on), '"""', msg='Unnecessary quote escape should be escaped')
def test_js_to_json_malformed(self): def test_js_to_json_malformed(self):
self.assertEqual(js_to_json('42a1'), '42"a1"') self.assertEqual(js_to_json('42a1'), '42"a1"')
self.assertEqual(js_to_json('42a-1'), '42"a"-1') self.assertEqual(js_to_json('42a-1'), '42"a"-1')
@ -1672,6 +1685,9 @@ def test_get_elements_text_and_html_by_attribute(self):
self.assertEqual(list(get_elements_text_and_html_by_attribute('class', 'foo', html)), []) self.assertEqual(list(get_elements_text_and_html_by_attribute('class', 'foo', html)), [])
self.assertEqual(list(get_elements_text_and_html_by_attribute('class', 'no-such-foo', html)), []) self.assertEqual(list(get_elements_text_and_html_by_attribute('class', 'no-such-foo', html)), [])
self.assertEqual(list(get_elements_text_and_html_by_attribute(
'class', 'foo', '<a class="foo">nice</a><span class="foo">nice</span>', tag='a')), [('nice', '<a class="foo">nice</a>')])
GET_ELEMENT_BY_TAG_TEST_STRING = ''' GET_ELEMENT_BY_TAG_TEST_STRING = '''
random text lorem ipsum</p> random text lorem ipsum</p>
<div> <div>
@ -1869,6 +1885,230 @@ def test_get_compatible_ext(self):
self.assertEqual(get_compatible_ext( self.assertEqual(get_compatible_ext(
vcodecs=['av1'], acodecs=['mp4a'], vexts=['webm'], aexts=['m4a'], preferences=('webm', 'mkv')), 'mkv') vcodecs=['av1'], acodecs=['mp4a'], vexts=['webm'], aexts=['m4a'], preferences=('webm', 'mkv')), 'mkv')
def test_traverse_obj(self):
_TEST_DATA = {
100: 100,
1.2: 1.2,
'str': 'str',
'None': None,
'...': ...,
'urls': [
{'index': 0, 'url': 'https://www.example.com/0'},
{'index': 1, 'url': 'https://www.example.com/1'},
],
'data': (
{'index': 2},
{'index': 3},
),
'dict': {},
}
# Test base functionality
self.assertEqual(traverse_obj(_TEST_DATA, ('str',)), 'str',
msg='allow tuple path')
self.assertEqual(traverse_obj(_TEST_DATA, ['str']), 'str',
msg='allow list path')
self.assertEqual(traverse_obj(_TEST_DATA, (value for value in ("str",))), 'str',
msg='allow iterable path')
self.assertEqual(traverse_obj(_TEST_DATA, 'str'), 'str',
msg='single items should be treated as a path')
self.assertEqual(traverse_obj(_TEST_DATA, None), _TEST_DATA)
self.assertEqual(traverse_obj(_TEST_DATA, 100), 100)
self.assertEqual(traverse_obj(_TEST_DATA, 1.2), 1.2)
# Test Ellipsis behavior
self.assertCountEqual(traverse_obj(_TEST_DATA, ...),
(item for item in _TEST_DATA.values() if item is not None),
msg='`...` should give all values except `None`')
self.assertCountEqual(traverse_obj(_TEST_DATA, ('urls', 0, ...)), _TEST_DATA['urls'][0].values(),
msg='`...` selection for dicts should select all values')
self.assertEqual(traverse_obj(_TEST_DATA, (..., ..., 'url')),
['https://www.example.com/0', 'https://www.example.com/1'],
msg='nested `...` queries should work')
self.assertCountEqual(traverse_obj(_TEST_DATA, (..., ..., 'index')), range(4),
msg='`...` query result should be flattened')
# Test function as key
self.assertEqual(traverse_obj(_TEST_DATA, lambda x, y: x == 'urls' and isinstance(y, list)),
[_TEST_DATA['urls']],
msg='function as query key should perform a filter based on (key, value)')
self.assertCountEqual(traverse_obj(_TEST_DATA, lambda _, x: isinstance(x[0], str)), {'str'},
msg='exceptions in the query function should be caught')
# Test alternative paths
self.assertEqual(traverse_obj(_TEST_DATA, 'fail', 'str'), 'str',
msg='multiple `paths` should be treated as alternative paths')
self.assertEqual(traverse_obj(_TEST_DATA, 'str', 100), 'str',
msg='alternatives should exit early')
self.assertEqual(traverse_obj(_TEST_DATA, 'fail', 'fail'), None,
msg='alternatives should return `default` if exhausted')
self.assertEqual(traverse_obj(_TEST_DATA, (..., 'fail'), 100), 100,
msg='alternatives should track their own branching return')
self.assertEqual(traverse_obj(_TEST_DATA, ('dict', ...), ('data', ...)), list(_TEST_DATA['data']),
msg='alternatives on empty objects should search further')
# Test branch and path nesting
self.assertEqual(traverse_obj(_TEST_DATA, ('urls', (3, 0), 'url')), ['https://www.example.com/0'],
msg='tuple as key should be treated as branches')
self.assertEqual(traverse_obj(_TEST_DATA, ('urls', [3, 0], 'url')), ['https://www.example.com/0'],
msg='list as key should be treated as branches')
self.assertEqual(traverse_obj(_TEST_DATA, ('urls', ((1, 'fail'), (0, 'url')))), ['https://www.example.com/0'],
msg='double nesting in path should be treated as paths')
self.assertEqual(traverse_obj(['0', [1, 2]], [(0, 1), 0]), [1],
msg='do not fail early on branching')
self.assertCountEqual(traverse_obj(_TEST_DATA, ('urls', ((1, ('fail', 'url')), (0, 'url')))),
['https://www.example.com/0', 'https://www.example.com/1'],
msg='triple nesting in path should be treated as branches')
self.assertEqual(traverse_obj(_TEST_DATA, ('urls', ('fail', (..., 'url')))),
['https://www.example.com/0', 'https://www.example.com/1'],
msg='ellipsis as branch path start gets flattened')
# Test dictionary as key
self.assertEqual(traverse_obj(_TEST_DATA, {0: 100, 1: 1.2}), {0: 100, 1: 1.2},
msg='dict key should result in a dict with the same keys')
self.assertEqual(traverse_obj(_TEST_DATA, {0: ('urls', 0, 'url')}),
{0: 'https://www.example.com/0'},
msg='dict key should allow paths')
self.assertEqual(traverse_obj(_TEST_DATA, {0: ('urls', (3, 0), 'url')}),
{0: ['https://www.example.com/0']},
msg='tuple in dict path should be treated as branches')
self.assertEqual(traverse_obj(_TEST_DATA, {0: ('urls', ((1, 'fail'), (0, 'url')))}),
{0: ['https://www.example.com/0']},
msg='double nesting in dict path should be treated as paths')
self.assertEqual(traverse_obj(_TEST_DATA, {0: ('urls', ((1, ('fail', 'url')), (0, 'url')))}),
{0: ['https://www.example.com/1', 'https://www.example.com/0']},
msg='triple nesting in dict path should be treated as branches')
self.assertEqual(traverse_obj(_TEST_DATA, {0: 'fail'}), {},
msg='remove `None` values when dict key')
self.assertEqual(traverse_obj(_TEST_DATA, {0: 'fail'}, default=...), {0: ...},
msg='do not remove `None` values if `default`')
self.assertEqual(traverse_obj(_TEST_DATA, {0: 'dict'}), {0: {}},
msg='do not remove empty values when dict key')
self.assertEqual(traverse_obj(_TEST_DATA, {0: 'dict'}, default=...), {0: {}},
msg='do not remove empty values when dict key and a default')
self.assertEqual(traverse_obj(_TEST_DATA, {0: ('dict', ...)}), {0: []},
msg='if branch in dict key not successful, return `[]`')
# Testing default parameter behavior
_DEFAULT_DATA = {'None': None, 'int': 0, 'list': []}
self.assertEqual(traverse_obj(_DEFAULT_DATA, 'fail'), None,
msg='default value should be `None`')
self.assertEqual(traverse_obj(_DEFAULT_DATA, 'fail', 'fail', default=...), ...,
msg='chained fails should result in default')
self.assertEqual(traverse_obj(_DEFAULT_DATA, 'None', 'int'), 0,
msg='should not short-circuit on `None`')
self.assertEqual(traverse_obj(_DEFAULT_DATA, 'fail', default=1), 1,
msg='invalid dict key should result in `default`')
self.assertEqual(traverse_obj(_DEFAULT_DATA, 'None', default=1), 1,
msg='`None` is a deliberate sentinel and should become `default`')
self.assertEqual(traverse_obj(_DEFAULT_DATA, ('list', 10)), None,
msg='`IndexError` should result in `default`')
self.assertEqual(traverse_obj(_DEFAULT_DATA, (..., 'fail'), default=1), 1,
msg='if branched but not successful return `default` if defined, not `[]`')
self.assertEqual(traverse_obj(_DEFAULT_DATA, (..., 'fail'), default=None), None,
msg='if branched but not successful return `default` even if `default` is `None`')
self.assertEqual(traverse_obj(_DEFAULT_DATA, (..., 'fail')), [],
msg='if branched but not successful return `[]`, not `default`')
self.assertEqual(traverse_obj(_DEFAULT_DATA, ('list', ...)), [],
msg='if branched but object is empty return `[]`, not `default`')
# Testing expected_type behavior
_EXPECTED_TYPE_DATA = {'str': 'str', 'int': 0}
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, 'str', expected_type=str), 'str',
msg='accept matching `expected_type` type')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, 'str', expected_type=int), None,
msg='reject non matching `expected_type` type')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, 'int', expected_type=lambda x: str(x)), '0',
msg='transform type using type function')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, 'str',
expected_type=lambda _: 1 / 0), None,
msg='wrap expected_type function in try_call')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, ..., expected_type=str), ['str'],
msg='eliminate items that expected_type fails on')
# Test get_all behavior
_GET_ALL_DATA = {'key': [0, 1, 2]}
self.assertEqual(traverse_obj(_GET_ALL_DATA, ('key', ...), get_all=False), 0,
msg='if not `get_all`, return only first matching value')
self.assertEqual(traverse_obj(_GET_ALL_DATA, ..., get_all=False), [0, 1, 2],
msg='do not overflatten if not `get_all`')
# Test casesense behavior
_CASESENSE_DATA = {
'KeY': 'value0',
0: {
'KeY': 'value1',
0: {'KeY': 'value2'},
},
}
self.assertEqual(traverse_obj(_CASESENSE_DATA, 'key'), None,
msg='dict keys should be case sensitive unless `casesense` is False')
self.assertEqual(traverse_obj(_CASESENSE_DATA, 'keY',
casesense=False), 'value0',
msg='allow non matching key case if `casesense` is False')
self.assertEqual(traverse_obj(_CASESENSE_DATA, (0, ('keY',)),
casesense=False), ['value1'],
msg='allow non matching key case in branch if `casesense` is False')
self.assertEqual(traverse_obj(_CASESENSE_DATA, (0, ((0, 'keY'),)),
casesense=False), ['value2'],
msg='allow non matching key case in branch path if `casesense` is False')
# Test traverse_string behavior
_TRAVERSE_STRING_DATA = {'str': 'str', 1.2: 1.2}
self.assertEqual(traverse_obj(_TRAVERSE_STRING_DATA, ('str', 0)), None,
msg='do not traverse into string if not `traverse_string`')
self.assertEqual(traverse_obj(_TRAVERSE_STRING_DATA, ('str', 0),
traverse_string=True), 's',
msg='traverse into string if `traverse_string`')
self.assertEqual(traverse_obj(_TRAVERSE_STRING_DATA, (1.2, 1),
traverse_string=True), '.',
msg='traverse into converted data if `traverse_string`')
self.assertEqual(traverse_obj(_TRAVERSE_STRING_DATA, ('str', ...),
traverse_string=True), list('str'),
msg='`...` branching into string should result in list')
self.assertEqual(traverse_obj(_TRAVERSE_STRING_DATA, ('str', (0, 2)),
traverse_string=True), ['s', 'r'],
msg='branching into string should result in list')
self.assertEqual(traverse_obj(_TRAVERSE_STRING_DATA, ('str', lambda _, x: x),
traverse_string=True), list('str'),
msg='function branching into string should result in list')
# Test is_user_input behavior
_IS_USER_INPUT_DATA = {'range8': list(range(8))}
self.assertEqual(traverse_obj(_IS_USER_INPUT_DATA, ('range8', '3'),
is_user_input=True), 3,
msg='allow for string indexing if `is_user_input`')
self.assertCountEqual(traverse_obj(_IS_USER_INPUT_DATA, ('range8', '3:'),
is_user_input=True), tuple(range(8))[3:],
msg='allow for string slice if `is_user_input`')
self.assertCountEqual(traverse_obj(_IS_USER_INPUT_DATA, ('range8', ':4:2'),
is_user_input=True), tuple(range(8))[:4:2],
msg='allow step in string slice if `is_user_input`')
self.assertCountEqual(traverse_obj(_IS_USER_INPUT_DATA, ('range8', ':'),
is_user_input=True), range(8),
msg='`:` should be treated as `...` if `is_user_input`')
with self.assertRaises(TypeError, msg='too many params should result in error'):
traverse_obj(_IS_USER_INPUT_DATA, ('range8', ':::'), is_user_input=True)
# Test re.Match as input obj
mobj = re.fullmatch(r'0(12)(?P<group>3)(4)?', '0123')
self.assertEqual(traverse_obj(mobj, ...), [x for x in mobj.groups() if x is not None],
msg='`...` on a `re.Match` should give its `groups()`')
self.assertEqual(traverse_obj(mobj, lambda k, _: k in (0, 2)), ['0123', '3'],
msg='function on a `re.Match` should give groupno, value starting at 0')
self.assertEqual(traverse_obj(mobj, 'group'), '3',
msg='str key on a `re.Match` should give group with that name')
self.assertEqual(traverse_obj(mobj, 2), '3',
msg='int key on a `re.Match` should give group with that number')
self.assertEqual(traverse_obj(mobj, 'gRoUp', casesense=False), '3',
msg='str key on a `re.Match` should respect casesense')
self.assertEqual(traverse_obj(mobj, 'fail'), None,
msg='failing str key on a `re.Match` should return `default`')
self.assertEqual(traverse_obj(mobj, 'gRoUpS', casesense=False), None,
msg='failing str key on a `re.Match` should return `default`')
self.assertEqual(traverse_obj(mobj, 8), None,
msg='failing int key on a `re.Match` should return `default`')
if __name__ == '__main__':
unittest.main()
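
The path rules exercised by `test_traverse_obj` (a single key treated as a path, several paths treated as alternatives, `...` branching into every value and returning a flattened list) can be illustrated with a much smaller stand-in. This is a minimal sketch, not yt-dlp's `traverse_obj`: the name `simple_traverse` is hypothetical, and it deliberately ignores function keys, dict keys, `expected_type`, `casesense`, `traverse_string` and `is_user_input`.

```python
def simple_traverse(obj, *paths, default=None):
    """Try each path in turn and return the first non-empty result (hypothetical helper)."""

    def follow(current, path):
        results, branched = [current], False
        for key in path:
            next_results = []
            for item in results:
                if key is ...:
                    # `...` branches into every value of a dict or sequence
                    branched = True
                    values = item.values() if isinstance(item, dict) else item
                    if isinstance(values, (str, bytes)) or not hasattr(values, '__iter__'):
                        continue
                    next_results.extend(v for v in values if v is not None)
                elif not isinstance(item, (str, bytes)):
                    # plain key or index lookup; failures simply drop this branch
                    try:
                        value = item[key]
                    except (KeyError, IndexError, TypeError):
                        continue
                    if value is not None:
                        next_results.append(value)
            results = next_results
        if branched:
            return results  # branching always yields a list
        return results[0] if results else None

    for path in paths:
        if not isinstance(path, (list, tuple)):
            path = (path,)  # a single item is treated as a one-element path
        result = follow(obj, path)
        if result not in (None, []):
            return result
    return default


data = {
    'str': 'str',
    'urls': [
        {'index': 0, 'url': 'https://www.example.com/0'},
        {'index': 1, 'url': 'https://www.example.com/1'},
    ],
    'dict': {},
}
first_match = simple_traverse(data, 'fail', 'str')
all_urls = simple_traverse(data, (..., ..., 'url'))
fallback = simple_traverse(data, ('urls', 5, 'url'), default='n/a')
```

The sketch treats an empty branch result as a miss so that the next alternative path is tried, which mirrors the "alternatives on empty objects should search further" assertion above.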


@ -94,6 +94,46 @@
'https://www.youtube.com/s/player/5dd88d1d/player-plasma-ias-phone-en_US.vflset/base.js',
'kSxKFLeqzv_ZyHSAt', 'n8gS8oRlHOxPFA',
),
(
'https://www.youtube.com/s/player/324f67b9/player_ias.vflset/en_US/base.js',
'xdftNy7dh9QGnhW', '22qLGxrmX8F1rA',
),
(
'https://www.youtube.com/s/player/4c3f79c5/player_ias.vflset/en_US/base.js',
'TDCstCG66tEAO5pR9o', 'dbxNtZ14c-yWyw',
),
(
'https://www.youtube.com/s/player/c81bbb4a/player_ias.vflset/en_US/base.js',
'gre3EcLurNY2vqp94', 'Z9DfGxWP115WTg',
),
(
'https://www.youtube.com/s/player/1f7d5369/player_ias.vflset/en_US/base.js',
'batNX7sYqIJdkJ', 'IhOkL_zxbkOZBw',
),
(
'https://www.youtube.com/s/player/009f1d77/player_ias.vflset/en_US/base.js',
'5dwFHw8aFWQUQtffRq', 'audescmLUzI3jw',
),
(
'https://www.youtube.com/s/player/dc0c6770/player_ias.vflset/en_US/base.js',
'5EHDMgYLV6HPGk_Mu-kk', 'n9lUJLHbxUI0GQ',
),
(
'https://www.youtube.com/s/player/113ca41c/player_ias.vflset/en_US/base.js',
'cgYl-tlYkhjT7A', 'hI7BBr2zUgcmMg',
),
(
'https://www.youtube.com/s/player/c57c113c/player_ias.vflset/en_US/base.js',
'M92UUMHa8PdvPd3wyM', '3hPqLJsiNZx7yA',
),
(
'https://www.youtube.com/s/player/5a3b6271/player_ias.vflset/en_US/base.js',
'B2j7f_UPT4rfje85Lu_e', 'm5DmNymaGQ5RdQ',
),
(
'https://www.youtube.com/s/player/7a062b77/player_ias.vflset/en_US/base.js',
'NRcE3y3mVtm_cV-W', 'VbsCYUATvqlt5w',
),
]
@ -101,6 +141,7 @@
class TestPlayerInfo(unittest.TestCase):
def test_youtube_extract_player_info(self):
PLAYER_URLS = (
('https://www.youtube.com/s/player/4c3f79c5/player_ias.vflset/en_US/base.js', '4c3f79c5'),
('https://www.youtube.com/s/player/64dddad9/player_ias.vflset/en_US/base.js', '64dddad9'),
('https://www.youtube.com/s/player/64dddad9/player_ias.vflset/fr_FR/base.js', '64dddad9'),
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-en_US.vflset/base.js', '64dddad9'),
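
Each `PLAYER_URLS` entry pairs a player URL with the ID expected from it. A rough sketch of such an extraction follows; the pattern and the name `extract_player_id` are assumptions made for illustration and are not the regex yt-dlp uses.

```python
import re

# Hypothetical pattern: the ID is taken to be the path segment after "/s/player/"
PLAYER_ID_RE = re.compile(r'/s/player/(?P<id>[a-zA-Z0-9_-]{8,})/')


def extract_player_id(player_url):
    """Return the player ID embedded in the URL, or None if there is none."""
    match = PLAYER_ID_RE.search(player_url)
    return match.group('id') if match else None


player_id = extract_player_id(
    'https://www.youtube.com/s/player/4c3f79c5/player_ias.vflset/en_US/base.js')
```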

1
test/testdata/ism/ec-3_test.Manifest vendored Normal file

@ -0,0 +1 @@
<?xml version="1.0" encoding="utf-8"?><!--Transformed by VSMT using XSL stylesheet for rule Identity--><!-- Created with Unified Streaming Platform (version=1.10.12-18737) --><SmoothStreamingMedia MajorVersion="2" MinorVersion="0" TimeScale="10000000" Duration="370000000"><StreamIndex Type="audio" QualityLevels="1" TimeScale="10000000" Language="deu" Name="audio_deu" Chunks="19" Url="QualityLevels({bitrate})/Fragments(audio_deu={start time})?noStreamProfile=1"><QualityLevel Index="0" Bitrate="127802" CodecPrivateData="1190" SamplingRate="48000" Channels="2" BitsPerSample="16" PacketSize="4" AudioTag="255" FourCC="AACL" /><c t="0" d="20053333" /><c d="20053334" /><c d="20053333" /><c d="19840000" /><c d="20053333" /><c d="20053334" /><c d="20053333" /><c d="19840000" /><c d="20053333" /><c d="20053334" /><c d="20053333" /><c d="19840000" /><c d="20053333" /><c d="20053334" /><c d="20053333" /><c d="19840000" /><c d="20053333" /><c d="20053334" /><c d="7253333" /></StreamIndex><StreamIndex Type="audio" QualityLevels="1" TimeScale="10000000" Language="deu" Name="audio_deu_1" Chunks="19" Url="QualityLevels({bitrate})/Fragments(audio_deu_1={start time})?noStreamProfile=1"><QualityLevel Index="0" Bitrate="224000" CodecPrivateData="00063F000000AF87FBA7022DFB42A4D405CD93843BDD0700200F00" FourCCData="0700200F00" SamplingRate="48000" Channels="6" BitsPerSample="16" PacketSize="896" AudioTag="65534" FourCC="EC-3" /><c t="0" d="20160000" /><c d="19840000" /><c d="20160000" /><c d="19840000" /><c d="20160000" /><c d="19840000" /><c d="20160000" /><c d="19840000" /><c d="20160000" /><c d="19840000" /><c d="20160000" /><c d="19840000" /><c d="20160000" /><c d="19840000" /><c d="20160000" /><c d="19840000" /><c d="20160000" /><c d="19840000" /><c d="8320000" /></StreamIndex><StreamIndex Type="video" QualityLevels="8" TimeScale="10000000" Language="deu" Name="video_deu" Chunks="19" Url="QualityLevels({bitrate})/Fragments(video_deu={start time})?noStreamProfile=1" MaxWidth="1920" 
MaxHeight="1080" DisplayWidth="1920" DisplayHeight="1080"><QualityLevel Index="0" Bitrate="23909" CodecPrivateData="000000016742C00CDB06077E5C05A808080A00000300020000030009C0C02EE0177CC6300F142AE00000000168CA8DC8" MaxWidth="384" MaxHeight="216" FourCC="AVC1" /><QualityLevel Index="1" Bitrate="403188" CodecPrivateData="00000001674D4014E98323B602D4040405000003000100000300320F1429380000000168EAECF2" MaxWidth="400" MaxHeight="224" FourCC="AVC1" /><QualityLevel Index="2" Bitrate="680365" CodecPrivateData="00000001674D401EE981405FF2E02D4040405000000300100000030320F162D3800000000168EAECF2" MaxWidth="640" MaxHeight="360" FourCC="AVC1" /><QualityLevel Index="3" Bitrate="1253465" CodecPrivateData="00000001674D401EE981405FF2E02D4040405000000300100000030320F162D3800000000168EAECF2" MaxWidth="640" MaxHeight="360" FourCC="AVC1" /><QualityLevel Index="4" Bitrate="2121558" CodecPrivateData="00000001674D401EECA0601BD80B50101014000003000400000300C83C58B6580000000168E93B3C80" MaxWidth="768" MaxHeight="432" FourCC="AVC1" /><QualityLevel Index="5" Bitrate="3275545" CodecPrivateData="00000001674D4020ECA02802DD80B501010140000003004000000C83C60C65800000000168E93B3C80" MaxWidth="1280" MaxHeight="720" FourCC="AVC1" /><QualityLevel Index="6" Bitrate="5300196" CodecPrivateData="00000001674D4028ECA03C0113F2E02D4040405000000300100000030320F18319600000000168E93B3C80" MaxWidth="1920" MaxHeight="1080" FourCC="AVC1" /><QualityLevel Index="7" Bitrate="8079312" CodecPrivateData="00000001674D4028ECA03C0113F2E02D4040405000000300100000030320F18319600000000168E93B3C80" MaxWidth="1920" MaxHeight="1080" FourCC="AVC1" /><c t="0" d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="20000000" /><c d="10000000" /></StreamIndex></SmoothStreamingMedia>


@ -29,6 +29,7 @@
from .downloader import FFmpegFD, get_suitable_downloader, shorten_protocol_name
from .downloader.rtmp import rtmpdump_version
from .extractor import gen_extractor_classes, get_info_extractor
from .extractor.common import UnsupportedURLIE
from .extractor.openload import PhantomJSwrapper
from .minicurses import format_text
from .postprocessor import _PLUGIN_CLASSES as plugin_postprocessors
@ -47,7 +48,7 @@
get_postprocessor,
)
from .postprocessor.ffmpeg import resolve_mapping as resolve_recode_mapping
from .update import detect_variant from .update import REPOSITORY, current_git_head, detect_variant
from .utils import (
DEFAULT_OUTTMPL,
IDENTITY,
@ -89,6 +90,7 @@
args_to_str,
bug_reports_message,
date_from_str,
deprecation_warning,
determine_ext,
determine_protocol,
encode_compat_str,
@ -106,6 +108,7 @@
get_domain,
int_or_none,
iri_to_uri,
is_path_like,
join_nonempty,
locked_file,
make_archive_id,
@ -115,6 +118,7 @@
network_exceptions,
number_of_digits,
orderedSet,
orderedSet_from_options,
parse_filesize,
preferredencoding,
prepend_extension,
@ -144,7 +148,7 @@
write_json_file,
write_string,
)
from .version import RELEASE_GIT_HEAD, __version__ from .version import RELEASE_GIT_HEAD, VARIANT, __version__
if compat_os_name == 'nt':
import ctypes
@ -236,7 +240,7 @@ class YoutubeDL:
Default is 'only_download' for CLI, but False for API
skip_playlist_after_errors: Number of allowed failures until the rest of
the playlist is skipped
force_generic_extractor: Force downloader to use the generic extractor allowed_extractors: List of regexes to match against extractor names that are allowed
overwrites: Overwrite all video and metadata files if True,
overwrite only non-video files if None
and don't overwrite any file if False
@ -248,7 +252,7 @@ class YoutubeDL:
matchtitle: Download only matching titles.
rejecttitle: Reject downloads for matching titles.
logger: Log messages to a logging.Logger instance.
logtostderr: Log messages to stderr instead of stdout. logtostderr: Print everything to stderr instead of stdout.
consoletitle: Display progress in console window's titlebar.
writedescription: Write the video description to a .description file
writeinfojson: Write the video description to a .info.json file
@ -272,7 +276,7 @@ class YoutubeDL:
subtitleslangs: List of languages of the subtitles to download (can be regex).
The list may contain "all" to refer to all the available
subtitles. The language can be prefixed with a "-" to
exclude it from the requested languages. Eg: ['all', '-live_chat'] exclude it from the requested languages, e.g. ['all', '-live_chat']
keepvideo: Keep the video file after post-processing
daterange: A DateRange object, download only if the upload_date is in the range.
skip_download: Skip the actual download of the video file
@ -290,9 +294,8 @@ class YoutubeDL:
downloaded.
Videos without view count information are always
downloaded. None for no limit.
download_archive: File name of a file where all downloads are recorded. download_archive: A set, or the name of a file where all downloads are recorded.
Videos already present in the file are not downloaded Videos already present in the file are not downloaded again.
again.
break_on_existing: Stop the download process after attempting to download a
file that is in the archive.
break_on_reject: Stop the download process when encountering a video that
@ -301,8 +304,9 @@ class YoutubeDL:
should act on each input URL as opposed to for the entire queue
cookiefile: File name or text stream from where cookies should be read and dumped to
cookiesfrombrowser: A tuple containing the name of the browser, the profile
name/pathfrom where cookies are loaded, and the name of the name/path from where cookies are loaded, the name of the keyring,
keyring. Eg: ('chrome', ) or ('vivaldi', 'default', 'BASICTEXT') and the container name, e.g. ('chrome', ) or
('vivaldi', 'default', 'BASICTEXT') or ('firefox', 'default', None, 'Meta')
legacyserverconnect: Explicitly allow HTTPS connection to servers that do not
support RFC 5746 secure renegotiation
nocheckcertificate: Do not verify SSL certificates
@ -444,6 +448,7 @@ class YoutubeDL:
* index: Section number (Optional)
force_keyframes_at_cuts: Re-encode the video when downloading ranges to get precise cuts
noprogress: Do not print the progress bar
live_from_start: Whether to download livestream videos from the start
The following parameters are not used by YoutubeDL itself, they are used by
the downloader (see yt_dlp/downloader/common.py):
@ -470,11 +475,13 @@ class YoutubeDL:
discontinuities such as ad breaks (default: False)
extractor_args: A dictionary of arguments to be passed to the extractors.
See "EXTRACTOR ARGUMENTS" for details.
Eg: {'youtube': {'skip': ['dash', 'hls']}} E.g. {'youtube': {'skip': ['dash', 'hls']}}
mark_watched: Mark videos watched (even with --simulate). Only for YouTube
The following options are deprecated and may be removed in the future:
force_generic_extractor: Force downloader to use the generic extractor
- Use allowed_extractors = ['generic', 'default']
playliststart: - Use playlist_items
Playlist item to start at.
playlistend: - Use playlist_items
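
Taken together, the docstring changes in this diff describe an options dictionary along the following lines. This is only a sketch built from the examples quoted in the docstring; the archive entry `'youtube EXAMPLE_ID'` is a hypothetical placeholder, and the options are shown as plain data rather than being passed to a `YoutubeDL` instance.

```python
# Sketch of an options dict using the parameters documented above
ydl_opts = {
    # replacement for the deprecated force_generic_extractor
    'allowed_extractors': ['generic', 'default'],
    # download_archive may now be a set instead of a file name
    'download_archive': {'youtube EXAMPLE_ID'},  # hypothetical entry
    # (browser, profile, keyring, container), as in the docstring example
    'cookiesfrombrowser': ('firefox', 'default', None, 'Meta'),
    'subtitleslangs': ['all', '-live_chat'],
    'extractor_args': {'youtube': {'skip': ['dash', 'hls']}},
}
```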
@ -527,7 +534,8 @@ class YoutubeDL:
""" """
_NUMERIC_FIELDS = { _NUMERIC_FIELDS = {
'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx', 'width', 'height', 'asr', 'audio_channels', 'fps',
'tbr', 'abr', 'vbr', 'filesize', 'filesize_approx',
'timestamp', 'release_timestamp',
'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
'average_rating', 'comment_count', 'age_limit',
@ -539,8 +547,8 @@ class YoutubeDL:
_format_fields = {
# NB: Keep in sync with the docstring of extractor/common.py
'url', 'manifest_url', 'manifest_stream_number', 'ext', 'format', 'format_id', 'format_note',
'width', 'height', 'resolution', 'dynamic_range', 'tbr', 'abr', 'acodec', 'asr', 'width', 'height', 'resolution', 'dynamic_range', 'tbr', 'abr', 'acodec', 'asr', 'audio_channels',
'vbr', 'fps', 'vcodec', 'container', 'filesize', 'filesize_approx', 'vbr', 'fps', 'vcodec', 'container', 'filesize', 'filesize_approx', 'rows', 'columns',
'player_url', 'protocol', 'fragment_base_url', 'fragments', 'is_from_start',
'preference', 'language', 'language_preference', 'quality', 'source_preference',
'http_headers', 'stretched_ratio', 'no_resume', 'has_drm', 'downloader_options',
@ -625,7 +633,7 @@ def check_deprecated(param, option, suggestion):
for msg in self.params.get('_warnings', []):
self.report_warning(msg)
for msg in self.params.get('_deprecation_warnings', []):
self.deprecation_warning(msg) self.deprecated_feature(msg)
self.params['compat_opts'] = set(self.params.get('compat_opts', ()))
if 'list-formats' in self.params['compat_opts']:
@ -715,21 +723,23 @@ def check_deprecated(param, option, suggestion):
def preload_download_archive(fn):
"""Preload the archive, if any is specified"""
archive = set()
if fn is None: if fn is None:
return False return archive
elif not is_path_like(fn):
return fn
self.write_debug(f'Loading archive file {fn!r}')
try:
with locked_file(fn, 'r', encoding='utf-8') as archive_file:
for line in archive_file:
self.archive.add(line.strip()) archive.add(line.strip())
except OSError as ioe:
if ioe.errno != errno.ENOENT:
raise
return False return archive
return True
self.archive = set() self.archive = preload_download_archive(self.params.get('download_archive'))
preload_download_archive(self.params.get('download_archive'))
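
The reworked `preload_download_archive` returns the archive instead of mutating `self.archive`, and passes through anything that is not path-like. A standalone sketch of that flow is shown below. It uses plain `open` instead of `locked_file`, and the `isinstance` check is an assumed stand-in for `is_path_like`; the name `load_archive` is hypothetical.

```python
import os


def load_archive(source):
    """Return the archive entries for `source` (None, a path, or a ready-made set)."""
    if source is None:
        return set()
    if not isinstance(source, (str, bytes, os.PathLike)):
        return source  # assumed to already be a set-like archive; use it as-is
    entries = set()
    try:
        with open(source, encoding='utf-8') as archive_file:
            for line in archive_file:
                entries.add(line.strip())
    except FileNotFoundError:
        pass  # a missing file just means nothing has been recorded yet
    return entries


in_memory = {'youtube EXAMPLE_ID'}  # hypothetical entry
same_object = load_archive(in_memory) is in_memory
```

Returning the caller's set unchanged means entries added during a run stay visible to the caller, which is presumably the point of accepting a set.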
def warn_if_short_id(self, argv):
# short YouTube ID starting with dash?
@ -755,13 +765,6 @@ def add_info_extractor(self, ie):
self._ies_instances[ie_key] = ie
ie.set_downloader(self)
def _get_info_extractor_class(self, ie_key):
ie = self._ies.get(ie_key)
if ie is None:
ie = get_info_extractor(ie_key)
self.add_info_extractor(ie)
return ie
def get_info_extractor(self, ie_key):
"""
Get an instance of an IE with name ie_key, it will try to get one from
@ -778,8 +781,19 @@ def add_default_info_extractors(self):
""" """
Add the InfoExtractors returned by gen_extractors to the end of the list Add the InfoExtractors returned by gen_extractors to the end of the list
""" """
for ie in gen_extractor_classes(): all_ies = {ie.IE_NAME.lower(): ie for ie in gen_extractor_classes()}
self.add_info_extractor(ie) all_ies['end'] = UnsupportedURLIE()
try:
ie_names = orderedSet_from_options(
self.params.get('allowed_extractors', ['default']), {
'all': list(all_ies),
'default': [name for name, ie in all_ies.items() if ie._ENABLED],
}, use_regex=True)
except re.error as e:
raise ValueError(f'Wrong regex for allowed_extractors: {e.pattern}')
for name in ie_names:
self.add_info_extractor(all_ies[name])
self.write_debug(f'Loaded {len(ie_names)} extractors')
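
The new `add_default_info_extractors` resolves `allowed_extractors` against the aliases `all` and `default` and otherwise treats each entry as a regex. The sketch below illustrates that idea only. `resolve_allowed` is a hypothetical name, it is not `orderedSet_from_options`, and matching with `re.fullmatch` is an assumption about how the patterns are applied.

```python
import re


def resolve_allowed(requested, all_names, default_names):
    """Expand aliases and regexes into an ordered, de-duplicated list of names."""
    aliases = {'all': list(all_names), 'default': list(default_names)}
    selected = []
    for pattern in requested:
        if pattern in aliases:
            matches = aliases[pattern]
        else:
            regex = re.compile(pattern)  # raises re.error on a bad pattern
            matches = [name for name in all_names if regex.fullmatch(name)]
        for name in matches:
            if name not in selected:
                selected.append(name)
    return selected


names = ['youtube', 'youtube:tab', 'vimeo', 'generic']
enabled = ['youtube', 'youtube:tab', 'vimeo']
only_youtube = resolve_allowed(['youtube.*'], names, enabled)
generic_first = resolve_allowed(['generic', 'default'], names, enabled)
```

Order follows the request, so `['generic', 'default']` puts the generic extractor ahead of the rest, which matches the replacement suggested for `force_generic_extractor` in the docstring.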
def add_post_processor(self, pp, when='post_process'):
"""Add a PostProcessor object to the end of the chain."""
@ -825,12 +839,14 @@ def _write_string(self, message, out=None, only_once=False):
def to_stdout(self, message, skip_eol=False, quiet=None):
"""Print message to stdout"""
if quiet is not None:
self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument quiet. Use "YoutubeDL.to_screen" instead') self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument quiet. '
'Use "YoutubeDL.to_screen" instead')
if skip_eol is not False:
self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument skip_eol. Use "YoutubeDL.to_screen" instead') self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument skip_eol. '
'Use "YoutubeDL.to_screen" instead')
self._write_string(f'{self._bidi_workaround(message)}\n', self._out_files.out)
def to_screen(self, message, skip_eol=False, quiet=None): def to_screen(self, message, skip_eol=False, quiet=None, only_once=False):
"""Print message to screen if not in quiet mode""" """Print message to screen if not in quiet mode"""
if self.params.get('logger'):
self.params['logger'].debug(message)
@ -839,7 +855,7 @@ def to_screen(self, message, skip_eol=False, quiet=None):
return
self._write_string(
'%s%s' % (self._bidi_workaround(message), ('' if skip_eol else '\n')),
self._out_files.screen) self._out_files.screen, only_once=only_once)
def to_stderr(self, message, only_once=False):
"""Print message to stderr"""
@ -963,11 +979,14 @@ def report_warning(self, message, only_once=False):
return
self.to_stderr(f'{self._format_err("WARNING:", self.Styles.WARNING)} {message}', only_once)
def deprecation_warning(self, message): def deprecation_warning(self, message, *, stacklevel=0):
deprecation_warning(
message, stacklevel=stacklevel + 1, printer=self.report_error, is_error=False)
def deprecated_feature(self, message):
if self.params.get('logger') is not None: if self.params.get('logger') is not None:
self.params['logger'].warning(f'DeprecationWarning: {message}') self.params['logger'].warning(f'Deprecated Feature: {message}')
else: self.to_stderr(f'{self._format_err("Deprecated Feature:", self.Styles.ERROR)} {message}', True)
self.to_stderr(f'{self._format_err("DeprecationWarning:", self.Styles.ERROR)} {message}', True)
def report_error(self, message, *args, **kwargs): def report_error(self, message, *args, **kwargs):
''' '''
@ -1027,7 +1046,7 @@ def _parse_outtmpl(self):
def get_output_path(self, dir_type='', filename=None): def get_output_path(self, dir_type='', filename=None):
paths = self.params.get('paths', {}) paths = self.params.get('paths', {})
assert isinstance(paths, dict) assert isinstance(paths, dict), '"paths" parameter must be a dictionary'
path = os.path.join( path = os.path.join(
expand_path(paths.get('home', '').strip()), expand_path(paths.get('home', '').strip()),
expand_path(paths.get(dir_type, '').strip()) if dir_type else '', expand_path(paths.get(dir_type, '').strip()) if dir_type else '',
@ -1045,7 +1064,7 @@ def _outtmpl_expandpath(outtmpl):
# outtmpl should be expand_path'ed before template dict substitution # outtmpl should be expand_path'ed before template dict substitution
# because meta fields may contain env variables we don't want to # because meta fields may contain env variables we don't want to
# be expanded. For example, for outtmpl "%(title)s.%(ext)s" and # be expanded. E.g. for outtmpl "%(title)s.%(ext)s" and
# title "Hello $PATH", we don't want `$PATH` to be expanded. # title "Hello $PATH", we don't want `$PATH` to be expanded.
return expand_path(outtmpl).replace(sep, '') return expand_path(outtmpl).replace(sep, '')
@ -1110,8 +1129,12 @@ def prepare_outtmpl(self, outtmpl, info_dict, sanitize=False):
'-': float.__sub__, '-': float.__sub__,
} }
# Field is of the form key1.key2... # Field is of the form key1.key2...
# where keys (except first) can be string, int or slice # where keys (except first) can be string, int, slice or "{field, ...}"
FIELD_RE = r'\w*(?:\.(?:\w+|{num}|{num}?(?::{num}?){{1,2}}))*'.format(num=r'(?:-?\d+)') FIELD_INNER_RE = r'(?:\w+|%(num)s|%(num)s?(?::%(num)s?){1,2})' % {'num': r'(?:-?\d+)'}
FIELD_RE = r'\w*(?:\.(?:%(inner)s|{%(field)s(?:,%(field)s)*}))*' % {
'inner': FIELD_INNER_RE,
'field': rf'\w*(?:\.{FIELD_INNER_RE})*'
}
MATH_FIELD_RE = rf'(?:{FIELD_RE}|-?{NUMBER_RE})' MATH_FIELD_RE = rf'(?:{FIELD_RE}|-?{NUMBER_RE})'
MATH_OPERATORS_RE = r'(?:%s)' % '|'.join(map(re.escape, MATH_FUNCTIONS.keys())) MATH_OPERATORS_RE = r'(?:%s)' % '|'.join(map(re.escape, MATH_FUNCTIONS.keys()))
INTERNAL_FORMAT_RE = re.compile(rf'''(?x) INTERNAL_FORMAT_RE = re.compile(rf'''(?x)
@ -1125,11 +1148,20 @@ def prepare_outtmpl(self, outtmpl, info_dict, sanitize=False):
(?:\|(?P<default>.*?))? (?:\|(?P<default>.*?))?
)$''') )$''')
def _traverse_infodict(k): def _traverse_infodict(fields):
k = k.split('.') fields = [f for x in re.split(r'\.({.+?})\.?', fields)
if k[0] == '': for f in ([x] if x.startswith('{') else x.split('.'))]
k.pop(0) for i in (0, -1):
return traverse_obj(info_dict, k, is_user_input=True, traverse_string=True) if fields and not fields[i]:
fields.pop(i)
for i, f in enumerate(fields):
if not f.startswith('{'):
continue
assert f.endswith('}'), f'No closing brace for {f} in {fields}'
fields[i] = {k: k.split('.') for k in f[1:-1].split(',')}
return traverse_obj(info_dict, fields, is_user_input=True, traverse_string=True)
def get_value(mdict): def get_value(mdict):
# Object traversal # Object traversal
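The new `_traverse_infodict` splits a dotted field on `.` while keeping each `{...}` group intact, then turns every group into a dict of traversal paths. The sketch below reproduces only that splitting step as a standalone function; `split_fields` is a hypothetical name, the braces in the regex are escaped for clarity, and the final `traverse_obj` call is left out.

```python
import re


def split_fields(fields):
    # Split on dots, but keep each "{...}" group together as a single item
    parts = [f for x in re.split(r'\.(\{.+?\})\.?', fields)
             for f in ([x] if x.startswith('{') else x.split('.'))]
    # Drop an empty first/last item left over from a leading or trailing separator
    for i in (0, -1):
        if parts and not parts[i]:
            parts.pop(i)
    # Convert each "{a,b.c}" group into a dict mapping key -> traversal path
    return [
        {k: k.split('.') for k in p[1:-1].split(',')} if p.startswith('{') else p
        for p in parts
    ]


print(split_fields('a.{b,c.d}.e'))
print(split_fields('.{id,title}'))
```

Under these assumptions, a plain field such as `title` passes through as a one-item list, while a braced group becomes a dict whose values are the per-key paths.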
@ -1215,9 +1247,11 @@ def create_key(outer_mobj):
delim = '\n' if '#' in flags else ', ' delim = '\n' if '#' in flags else ', '
value, fmt = delim.join(map(str, variadic(value, allowed_types=(str, bytes)))), str_fmt value, fmt = delim.join(map(str, variadic(value, allowed_types=(str, bytes)))), str_fmt
elif fmt[-1] == 'j': # json elif fmt[-1] == 'j': # json
value, fmt = json.dumps(value, default=_dumpjson_default, indent=4 if '#' in flags else None), str_fmt value, fmt = json.dumps(
value, default=_dumpjson_default,
indent=4 if '#' in flags else None, ensure_ascii='+' not in flags), str_fmt
elif fmt[-1] == 'h': # html elif fmt[-1] == 'h': # html
value, fmt = escapeHTML(value), str_fmt value, fmt = escapeHTML(str(value)), str_fmt
elif fmt[-1] == 'q': # quoted elif fmt[-1] == 'q': # quoted
value = map(str, variadic(value) if '#' in flags else [value]) value = map(str, variadic(value) if '#' in flags else [value])
value, fmt = ' '.join(map(compat_shlex_quote, value)), str_fmt value, fmt = ' '.join(map(compat_shlex_quote, value)), str_fmt
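The `j` conversion in this hunk now passes `ensure_ascii='+' not in flags` to `json.dumps`, so a `+` flag turns ASCII escaping off. A minimal illustration of what that argument changes, using only the standard library (the sample dictionary is made up):

```python
import json

value = {'title': 'Café'}

# Default behaviour: non-ASCII characters are escaped
escaped = json.dumps(value)

# With ensure_ascii=False (what the "+" flag selects), they are kept as-is
readable = json.dumps(value, ensure_ascii=False)

print(escaped)
print(readable)
```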
@ -1389,18 +1423,19 @@ def add_extra_info(info_dict, extra_info):
def extract_info(self, url, download=True, ie_key=None, extra_info=None, def extract_info(self, url, download=True, ie_key=None, extra_info=None,
process=True, force_generic_extractor=False): process=True, force_generic_extractor=False):
""" """
Return a list with a dictionary for each video extracted. Extract and return the information dictionary of the URL
Arguments: Arguments:
url -- URL to extract @param url URL to extract
Keyword arguments: Keyword arguments:
download -- whether to download videos during extraction @param download Whether to download videos
ie_key -- extractor key hint @param process Whether to resolve all unresolved references (URLs, playlist items).
extra_info -- dictionary containing the extra values to add to each result Must be True for download to work
process -- whether to resolve all unresolved references (URLs, playlist items), @param ie_key Use only the extractor with this key
must be True for download to work.
force_generic_extractor -- force using the generic extractor @param extra_info Dictionary containing the extra values to add to the info (For internal use only)
@param force_generic_extractor Force using the generic extractor (Deprecated; use ie_key='Generic')
""" """
if extra_info is None: if extra_info is None:
@ -1410,11 +1445,11 @@ def extract_info(self, url, download=True, ie_key=None, extra_info=None,
ie_key = 'Generic' ie_key = 'Generic'
if ie_key: if ie_key:
ies = {ie_key: self._get_info_extractor_class(ie_key)} ies = {ie_key: self._ies[ie_key]} if ie_key in self._ies else {}
else: else:
ies = self._ies ies = self._ies
for ie_key, ie in ies.items(): for key, ie in ies.items():
if not ie.suitable(url): if not ie.suitable(url):
continue continue
@ -1423,14 +1458,16 @@ def extract_info(self, url, download=True, ie_key=None, extra_info=None,
'and will probably not work.') 'and will probably not work.')
temp_id = ie.get_temp_id(url) temp_id = ie.get_temp_id(url)
if temp_id is not None and self.in_download_archive({'id': temp_id, 'ie_key': ie_key}): if temp_id is not None and self.in_download_archive({'id': temp_id, 'ie_key': key}):
self.to_screen(f'[{ie_key}] {temp_id}: has already been recorded in the archive') self.to_screen(f'[{key}] {temp_id}: has already been recorded in the archive')
if self.params.get('break_on_existing', False): if self.params.get('break_on_existing', False):
raise ExistingVideoReached() raise ExistingVideoReached()
break break
return self.__extract_info(url, self.get_info_extractor(ie_key), download, extra_info, process) return self.__extract_info(url, self.get_info_extractor(key), download, extra_info, process)
else: else:
self.report_error('no suitable InfoExtractor for URL %s' % url) extractors_restricted = self.params.get('allowed_extractors') not in (None, ['default'])
self.report_error(f'No suitable extractor{format_field(ie_key, None, " (%s)")} found for URL {url}',
tb=False if extractors_restricted else None)
def _handle_extraction_exceptions(func): def _handle_extraction_exceptions(func):
@functools.wraps(func) @functools.wraps(func)
@ -1584,6 +1621,7 @@ def process_ie_result(self, ie_result, download=True, extra_info=None):
self.add_default_extra_info(info_copy, ie, ie_result['url']) self.add_default_extra_info(info_copy, ie, ie_result['url'])
self.add_extra_info(info_copy, extra_info) self.add_extra_info(info_copy, extra_info)
info_copy, _ = self.pre_process(info_copy) info_copy, _ = self.pre_process(info_copy)
self._fill_common_fields(info_copy, False)
self.__forced_printings(info_copy, self.prepare_filename(info_copy), incomplete=True) self.__forced_printings(info_copy, self.prepare_filename(info_copy), incomplete=True)
self._raise_pending_errors(info_copy) self._raise_pending_errors(info_copy)
if self.params.get('force_write_download_archive', False): if self.params.get('force_write_download_archive', False):
@ -1650,8 +1688,8 @@ def process_ie_result(self, ie_result, download=True, extra_info=None):
elif result_type in ('playlist', 'multi_video'): elif result_type in ('playlist', 'multi_video'):
# Protect from infinite recursion due to recursively nested playlists # Protect from infinite recursion due to recursively nested playlists
# (see https://github.com/ytdl-org/youtube-dl/issues/27833) # (see https://github.com/ytdl-org/youtube-dl/issues/27833)
webpage_url = ie_result['webpage_url'] webpage_url = ie_result.get('webpage_url') # Playlists may not have webpage_url
if webpage_url in self._playlist_urls: if webpage_url and webpage_url in self._playlist_urls:
self.to_screen( self.to_screen(
'[download] Skipping already downloaded playlist: %s' '[download] Skipping already downloaded playlist: %s'
% ie_result.get('title') or ie_result.get('id')) % ie_result.get('title') or ie_result.get('id'))
@ -1705,14 +1743,17 @@ def _playlist_infodict(ie_result, strict=False, **kwargs):
} }
if strict: if strict:
return info return info
if ie_result.get('webpage_url'):
info.update({
'webpage_url': ie_result['webpage_url'],
'webpage_url_basename': url_basename(ie_result['webpage_url']),
'webpage_url_domain': get_domain(ie_result['webpage_url']),
})
return { return {
**info, **info,
'playlist_index': 0, 'playlist_index': 0,
'__last_playlist_index': max(ie_result['requested_entries'] or (0, 0)), '__last_playlist_index': max(ie_result['requested_entries'] or (0, 0)),
'extractor': ie_result['extractor'], 'extractor': ie_result['extractor'],
'webpage_url': ie_result['webpage_url'],
'webpage_url_basename': url_basename(ie_result['webpage_url']),
'webpage_url_domain': get_domain(ie_result['webpage_url']),
'extractor_key': ie_result['extractor_key'], 'extractor_key': ie_result['extractor_key'],
} }
@ -1796,6 +1837,8 @@ def __process_playlist(self, ie_result, download):
}) })
if self._match_entry(entry_copy, incomplete=True) is not None: if self._match_entry(entry_copy, incomplete=True) is not None:
# For compatibility with youtube-dl. See https://github.com/yt-dlp/yt-dlp/issues/4369
resolved_entries[i] = (playlist_index, NO_DEFAULT)
continue continue
self.to_screen('[download] Downloading video %s of %s' % ( self.to_screen('[download] Downloading video %s of %s' % (
@ -1816,7 +1859,8 @@ def __process_playlist(self, ie_result, download):
resolved_entries[i] = (playlist_index, entry_result) resolved_entries[i] = (playlist_index, entry_result)
# Update with processed data # Update with processed data
ie_result['requested_entries'], ie_result['entries'] = tuple(zip(*resolved_entries)) or ([], []) ie_result['requested_entries'] = [i for i, e in resolved_entries if e is not NO_DEFAULT]
ie_result['entries'] = [e for _, e in resolved_entries if e is not NO_DEFAULT]
# Write the updated info to json # Write the updated info to json
if _infojson_written is True and self._write_info_json( if _infojson_written is True and self._write_info_json(
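The hunk above marks entries rejected by the match filter with a sentinel and filters them out afterwards, instead of unzipping the list with `tuple(zip(*...))`. A small sketch of that pattern follows; `NO_DEFAULT` here is a local stand-in object rather than yt-dlp's own sentinel, and the entry data is invented.

```python
NO_DEFAULT = object()  # stand-in for the sentinel used in the diff

# (playlist_index, entry) pairs; the second entry was rejected by the filter
resolved_entries = [(1, {'id': 'a'}), (2, NO_DEFAULT), (3, {'id': 'c'})]

requested_entries = [i for i, e in resolved_entries if e is not NO_DEFAULT]
entries = [e for _, e in resolved_entries if e is not NO_DEFAULT]

print(requested_entries, entries)
```

One consequence of this design is that an empty `resolved_entries` list yields two empty lists without needing the `or ([], [])` fallback from the old code.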
@ -1973,8 +2017,8 @@ def _parse_filter(tokens):
filter_parts.append(string) filter_parts.append(string)
def _remove_unused_ops(tokens): def _remove_unused_ops(tokens):
# Remove operators that we don't use and join them with the surrounding strings # Remove operators that we don't use and join them with the surrounding strings.
# for example: 'mp4' '-' 'baseline' '-' '16x9' is converted to 'mp4-baseline-16x9' # E.g. 'mp4' '-' 'baseline' '-' '16x9' is converted to 'mp4-baseline-16x9'
ALLOWED_OPS = ('/', '+', ',', '(', ')') ALLOWED_OPS = ('/', '+', ',', '(', ')')
last_string, last_start, last_end, last_line = None, None, None, None last_string, last_start, last_end, last_line = None, None, None, None
for type, string, start, end, line in tokens: for type, string, start, end, line in tokens:
@ -2129,6 +2173,7 @@ def _merge(formats_pair):
'acodec': the_only_audio.get('acodec'), 'acodec': the_only_audio.get('acodec'),
'abr': the_only_audio.get('abr'), 'abr': the_only_audio.get('abr'),
'asr': the_only_audio.get('asr'), 'asr': the_only_audio.get('asr'),
'audio_channels': the_only_audio.get('audio_channels')
}) })
return new_dict return new_dict
@ -2335,10 +2380,9 @@ def check_thumbnails(thumbnails):
else: else:
info_dict['thumbnails'] = thumbnails info_dict['thumbnails'] = thumbnails
def _fill_common_fields(self, info_dict, is_video=True): def _fill_common_fields(self, info_dict, final=True):
# TODO: move sanitization here # TODO: move sanitization here
if is_video: if final:
# playlists are allowed to lack "title"
title = info_dict.get('title', NO_DEFAULT) title = info_dict.get('title', NO_DEFAULT)
if title is NO_DEFAULT: if title is NO_DEFAULT:
raise ExtractorError('Missing "title" field in extractor result', raise ExtractorError('Missing "title" field in extractor result',
@ -2382,11 +2426,13 @@ def _fill_common_fields(self, info_dict, is_video=True):
for key in live_keys: for key in live_keys:
if info_dict.get(key) is None: if info_dict.get(key) is None:
info_dict[key] = (live_status == key) info_dict[key] = (live_status == key)
if live_status == 'post_live':
info_dict['was_live'] = True
# Auto generate title fields corresponding to the *_number fields when missing # Auto generate title fields corresponding to the *_number fields when missing
# in order to always have clean titles. This is very common for TV series. # in order to always have clean titles. This is very common for TV series.
for field in ('chapter', 'season', 'episode'): for field in ('chapter', 'season', 'episode'):
if info_dict.get('%s_number' % field) is not None and not info_dict.get(field): if final and info_dict.get('%s_number' % field) is not None and not info_dict.get(field):
info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field]) info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field])
def _raise_pending_errors(self, info): def _raise_pending_errors(self, info):
@ -2479,21 +2525,17 @@ def sanitize_numeric_fields(info):
info_dict['requested_subtitles'] = self.process_subtitles( info_dict['requested_subtitles'] = self.process_subtitles(
info_dict['id'], subtitles, automatic_captions) info_dict['id'], subtitles, automatic_captions)
if info_dict.get('formats') is None: formats = self._get_formats(info_dict)
# There's only one format available
formats = [info_dict]
else:
formats = info_dict['formats']
# or None ensures --clean-infojson removes it # or None ensures --clean-infojson removes it
info_dict['_has_drm'] = any(f.get('has_drm') for f in formats) or None info_dict['_has_drm'] = any(f.get('has_drm') for f in formats) or None
if not self.params.get('allow_unplayable_formats'): if not self.params.get('allow_unplayable_formats'):
formats = [f for f in formats if not f.get('has_drm')] formats = [f for f in formats if not f.get('has_drm')]
if info_dict['_has_drm'] and formats and all(
f.get('acodec') == f.get('vcodec') == 'none' for f in formats): if formats and all(f.get('acodec') == f.get('vcodec') == 'none' for f in formats):
self.report_warning( self.report_warning(
'This video is DRM protected and only images are available for download. ' f'{"This video is DRM protected and " if info_dict["_has_drm"] else ""}'
'Use --list-formats to see them') 'only images are available for download. Use --list-formats to see them'.capitalize())
get_from_start = not info_dict.get('is_live') or bool(self.params.get('live_from_start')) get_from_start = not info_dict.get('is_live') or bool(self.params.get('live_from_start'))
if not get_from_start: if not get_from_start:
@ -2505,9 +2547,6 @@ def sanitize_numeric_fields(info):
'--live-from-start is passed, but there are no formats that can be downloaded from the start. ' '--live-from-start is passed, but there are no formats that can be downloaded from the start. '
'If you want to download from the current time, use --no-live-from-start')) 'If you want to download from the current time, use --no-live-from-start'))
if not formats:
self.raise_no_formats(info_dict)
def is_wellformed(f): def is_wellformed(f):
url = f.get('url') url = f.get('url')
if not url: if not url:
@ -2520,7 +2559,10 @@ def is_wellformed(f):
return True return True
# Filter out malformed formats for better extraction robustness # Filter out malformed formats for better extraction robustness
formats = list(filter(is_wellformed, formats)) formats = list(filter(is_wellformed, formats or []))
if not formats:
self.raise_no_formats(info_dict)
formats_dict = {} formats_dict = {}
@ -2598,7 +2640,7 @@ def is_wellformed(f):
info_dict, _ = self.pre_process(info_dict, 'after_filter') info_dict, _ = self.pre_process(info_dict, 'after_filter')
# The pre-processors may have modified the formats # The pre-processors may have modified the formats
formats = info_dict.get('formats', [info_dict]) formats = self._get_formats(info_dict)
list_only = self.params.get('simulate') is None and ( list_only = self.params.get('simulate') is None and (
self.params.get('list_thumbnails') or self.params.get('listformats') or self.params.get('listsubtitles')) self.params.get('list_thumbnails') or self.params.get('listformats') or self.params.get('listsubtitles'))
@ -2656,31 +2698,30 @@ def is_wellformed(f):
# Process what we can, even without any available formats. # Process what we can, even without any available formats.
formats_to_download = [{}] formats_to_download = [{}]
requested_ranges = self.params.get('download_ranges') requested_ranges = tuple(self.params.get('download_ranges', lambda *_: [{}])(info_dict, self))
if requested_ranges:
requested_ranges = tuple(requested_ranges(info_dict, self))
best_format, downloaded_formats = formats_to_download[-1], [] best_format, downloaded_formats = formats_to_download[-1], []
if download: if download:
if best_format: if best_format and requested_ranges:
def to_screen(*msg): def to_screen(*msg):
self.to_screen(f'[info] {info_dict["id"]}: {" ".join(", ".join(variadic(m)) for m in msg)}') self.to_screen(f'[info] {info_dict["id"]}: {" ".join(", ".join(variadic(m)) for m in msg)}')
to_screen(f'Downloading {len(formats_to_download)} format(s):', to_screen(f'Downloading {len(formats_to_download)} format(s):',
(f['format_id'] for f in formats_to_download)) (f['format_id'] for f in formats_to_download))
if requested_ranges: if requested_ranges != ({}, ):
to_screen(f'Downloading {len(requested_ranges)} time ranges:', to_screen(f'Downloading {len(requested_ranges)} time ranges:',
(f'{int(c["start_time"])}-{int(c["end_time"])}' for c in requested_ranges)) (f'{c["start_time"]:.1f}-{c["end_time"]:.1f}' for c in requested_ranges))
max_downloads_reached = False max_downloads_reached = False
for fmt, chapter in itertools.product(formats_to_download, requested_ranges or [{}]): for fmt, chapter in itertools.product(formats_to_download, requested_ranges):
new_info = self._copy_infodict(info_dict) new_info = self._copy_infodict(info_dict)
new_info.update(fmt) new_info.update(fmt)
offset, duration = info_dict.get('section_start') or 0, info_dict.get('duration') or float('inf') offset, duration = info_dict.get('section_start') or 0, info_dict.get('duration') or float('inf')
end_time = offset + min(chapter.get('end_time', duration), duration)
if chapter or offset: if chapter or offset:
new_info.update({ new_info.update({
'section_start': offset + chapter.get('start_time', 0), 'section_start': offset + chapter.get('start_time', 0),
'section_end': offset + min(chapter.get('end_time', duration), duration), # duration may not be accurate. So allow deviations <1sec
'section_end': end_time if end_time <= offset + duration + 1 else None,
'section_title': chapter.get('title'), 'section_title': chapter.get('title'),
'section_number': chapter.get('index'), 'section_number': chapter.get('index'),
}) })
@ -2722,42 +2763,26 @@ def process_subtitles(self, video_id, normal_subtitles, automatic_captions):
if lang not in available_subs: if lang not in available_subs:
available_subs[lang] = cap_info available_subs[lang] = cap_info
if (not self.params.get('writesubtitles') and not if not available_subs or (
self.params.get('writeautomaticsub') or not not self.params.get('writesubtitles')
available_subs): and not self.params.get('writeautomaticsub')):
return None return None
all_sub_langs = tuple(available_subs.keys()) all_sub_langs = tuple(available_subs.keys())
if self.params.get('allsubtitles', False): if self.params.get('allsubtitles', False):
requested_langs = all_sub_langs requested_langs = all_sub_langs
elif self.params.get('subtitleslangs', False): elif self.params.get('subtitleslangs', False):
# A list is used so that the order of languages will be the same as try:
# given in subtitleslangs. See https://github.com/yt-dlp/yt-dlp/issues/1041 requested_langs = orderedSet_from_options(
requested_langs = [] self.params.get('subtitleslangs'), {'all': all_sub_langs}, use_regex=True)
for lang_re in self.params.get('subtitleslangs'): except re.error as e:
discard = lang_re[0] == '-' raise ValueError(f'Wrong regex for subtitleslangs: {e.pattern}')
if discard:
lang_re = lang_re[1:]
if lang_re == 'all':
if discard:
requested_langs = []
else:
requested_langs.extend(all_sub_langs)
continue
current_langs = filter(re.compile(lang_re + '$').match, all_sub_langs)
if discard:
for lang in current_langs:
while lang in requested_langs:
requested_langs.remove(lang)
else:
requested_langs.extend(current_langs)
requested_langs = orderedSet(requested_langs)
elif normal_sub_langs: elif normal_sub_langs:
requested_langs = ['en'] if 'en' in normal_sub_langs else normal_sub_langs[:1] requested_langs = ['en'] if 'en' in normal_sub_langs else normal_sub_langs[:1]
else: else:
requested_langs = ['en'] if 'en' in all_sub_langs else all_sub_langs[:1] requested_langs = ['en'] if 'en' in all_sub_langs else all_sub_langs[:1]
if requested_langs: if requested_langs:
self.write_debug('Downloading subtitles: %s' % ', '.join(requested_langs)) self.to_screen(f'[info] {video_id}: Downloading subtitles: {", ".join(requested_langs)}')
formats_query = self.params.get('subtitlesformat', 'best') formats_query = self.params.get('subtitlesformat', 'best')
formats_preference = formats_query.split('/') if formats_query else [] formats_preference = formats_query.split('/') if formats_query else []
@ -2793,13 +2818,17 @@ def _forceprint(self, key, info_dict):
info_copy['automatic_captions_table'] = self.render_subtitles_table(info_dict.get('id'), info_dict.get('automatic_captions')) info_copy['automatic_captions_table'] = self.render_subtitles_table(info_dict.get('id'), info_dict.get('automatic_captions'))
def format_tmpl(tmpl): def format_tmpl(tmpl):
mobj = re.match(r'\w+(=?)$', tmpl) mobj = re.fullmatch(r'([\w.:,]|-\d|(?P<dict>{([\w.:,]|-\d)+}))+=?', tmpl)
if mobj and mobj.group(1): if not mobj:
return f'{tmpl[:-1]} = %({tmpl[:-1]})r'
elif mobj:
return f'%({tmpl})s'
return tmpl return tmpl
fmt = '%({})s'
if tmpl.startswith('{'):
tmpl = f'.{tmpl}'
if tmpl.endswith('='):
tmpl, fmt = tmpl[:-1], '{0} = %({0})#j'
return '\n'.join(map(fmt.format, [tmpl] if mobj.group('dict') else tmpl.split(',')))
for tmpl in self.params['forceprint'].get(key, []): for tmpl in self.params['forceprint'].get(key, []):
self.to_stdout(self.evaluate_outtmpl(format_tmpl(tmpl), info_copy)) self.to_stdout(self.evaluate_outtmpl(format_tmpl(tmpl), info_copy))
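The rewritten `format_tmpl` expands a bare field list into one output template per field, and treats a trailing `=` as a request for `name = value` output in JSON form. The following standalone copy of the new-side logic shows the strings it produces; the braces in the regex are escaped here for readability, which should not change what it matches.

```python
import re


def format_tmpl(tmpl):
    mobj = re.fullmatch(r'([\w.:,]|-\d|(?P<dict>\{([\w.:,]|-\d)+\}))+=?', tmpl)
    if not mobj:
        # Not a bare field list: assume it is already a full template
        return tmpl

    fmt = '%({})s'
    if tmpl.startswith('{'):
        tmpl = f'.{tmpl}'
    if tmpl.endswith('='):
        tmpl, fmt = tmpl[:-1], '{0} = %({0})#j'
    # A "{...}" group stays whole; otherwise each comma-separated field gets its own line
    return '\n'.join(map(fmt.format, [tmpl] if mobj.group('dict') else tmpl.split(',')))


for example in ('title', 'id,title', 'title=', '{id,title}', '%(title)s'):
    print(repr(format_tmpl(example)))
```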
@ -3265,6 +3294,7 @@ def wrapper(*args, **kwargs):
self.to_screen(f'[info] {e}') self.to_screen(f'[info] {e}')
if not self.params.get('break_per_url'): if not self.params.get('break_per_url'):
raise raise
self._num_downloads = 0
else: else:
if self.params.get('dump_single_json', False): if self.params.get('dump_single_json', False):
self.post_extract(res) self.post_extract(res)
@ -3313,6 +3343,12 @@ def sanitize_info(info_dict, remove_private_keys=False):
return info_dict return info_dict
info_dict.setdefault('epoch', int(time.time())) info_dict.setdefault('epoch', int(time.time()))
info_dict.setdefault('_type', 'video') info_dict.setdefault('_type', 'video')
info_dict.setdefault('_version', {
'version': __version__,
'current_git_head': current_git_head(),
'release_git_head': RELEASE_GIT_HEAD,
'repository': REPOSITORY,
})
if remove_private_keys: if remove_private_keys:
reject = lambda k, v: v is None or k.startswith('__') or k in { reject = lambda k, v: v is None or k.startswith('__') or k in {
@ -3433,12 +3469,11 @@ def _make_archive_id(self, info_dict):
return make_archive_id(extractor, video_id) return make_archive_id(extractor, video_id)
def in_download_archive(self, info_dict): def in_download_archive(self, info_dict):
fn = self.params.get('download_archive') if not self.archive:
if fn is None:
return False return False
vid_ids = [self._make_archive_id(info_dict)] vid_ids = [self._make_archive_id(info_dict)]
vid_ids.extend(info_dict.get('_old_archive_ids', [])) vid_ids.extend(info_dict.get('_old_archive_ids') or [])
return any(id_ in self.archive for id_ in vid_ids) return any(id_ in self.archive for id_ in vid_ids)
def record_download_archive(self, info_dict): def record_download_archive(self, info_dict):
@ -3447,7 +3482,9 @@ def record_download_archive(self, info_dict):
return return
vid_id = self._make_archive_id(info_dict) vid_id = self._make_archive_id(info_dict)
assert vid_id assert vid_id
self.write_debug(f'Adding to archive: {vid_id}') self.write_debug(f'Adding to archive: {vid_id}')
if is_path_like(fn):
with locked_file(fn, 'a', encoding='utf-8') as archive_file: with locked_file(fn, 'a', encoding='utf-8') as archive_file:
archive_file.write(vid_id + '\n') archive_file.write(vid_id + '\n')
self.archive.add(vid_id) self.archive.add(vid_id)
@ -3531,11 +3568,17 @@ def _format_note(self, fdict):
res += '~' + format_bytes(fdict['filesize_approx']) res += '~' + format_bytes(fdict['filesize_approx'])
return res return res
def render_formats_table(self, info_dict): def _get_formats(self, info_dict):
if not info_dict.get('formats') and not info_dict.get('url'): if info_dict.get('formats') is None:
return None if info_dict.get('url') and info_dict.get('_type', 'video') == 'video':
return [info_dict]
return []
return info_dict['formats']
formats = info_dict.get('formats', [info_dict]) def render_formats_table(self, info_dict):
formats = self._get_formats(info_dict)
if not formats:
return
if not self.params.get('listformats_table', True) is not False: if not self.params.get('listformats_table', True) is not False:
table = [ table = [
[ [
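The new `_get_formats` helper centralises the "single format" fallback: an info dict without `formats` counts as its own only format, but only when it has a `url` and is of type `video`. A self-contained sketch of that rule, written as a plain function (the name `get_formats` and the sample dicts are illustrative):

```python
def get_formats(info_dict):
    if info_dict.get('formats') is None:
        # Fall back to the info dict itself only for videos that carry a URL
        if info_dict.get('url') and info_dict.get('_type', 'video') == 'video':
            return [info_dict]
        return []
    return info_dict['formats']


video = {'id': 'x', 'url': 'https://example.com/v.mp4'}
playlist = {'_type': 'playlist', 'url': 'https://example.com/list'}

print(get_formats(video) == [video])
print(get_formats(playlist))
print(get_formats({'formats': []}))
```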
@ -3543,7 +3586,7 @@ def render_formats_table(self, info_dict):
format_field(f, 'ext'), format_field(f, 'ext'),
self.format_resolution(f), self.format_resolution(f),
self._format_note(f) self._format_note(f)
] for f in formats if f.get('preference') is None or f['preference'] >= -1000] ] for f in formats if (f.get('preference') or 0) >= -1000]
return render_table(['format code', 'extension', 'resolution', 'note'], table, extra_gap=1) return render_table(['format code', 'extension', 'resolution', 'note'], table, extra_gap=1)
def simplified_codec(f, field): def simplified_codec(f, field):
@ -3569,6 +3612,7 @@ def simplified_codec(f, field):
format_field(f, func=self.format_resolution, ignore=('audio only', 'images')), format_field(f, func=self.format_resolution, ignore=('audio only', 'images')),
format_field(f, 'fps', '\t%d', func=round), format_field(f, 'fps', '\t%d', func=round),
format_field(f, 'dynamic_range', '%s', ignore=(None, 'SDR')).replace('HDR', ''), format_field(f, 'dynamic_range', '%s', ignore=(None, 'SDR')).replace('HDR', ''),
format_field(f, 'audio_channels', '\t%s'),
delim, delim,
format_field(f, 'filesize', ' \t%s', func=format_bytes) + format_field(f, 'filesize_approx', '~\t%s', func=format_bytes), format_field(f, 'filesize', ' \t%s', func=format_bytes) + format_field(f, 'filesize_approx', '~\t%s', func=format_bytes),
format_field(f, 'tbr', '\t%dk', func=round), format_field(f, 'tbr', '\t%dk', func=round),
@ -3588,7 +3632,7 @@ def simplified_codec(f, field):
delim=' '), delim=' '),
] for f in formats if f.get('preference') is None or f['preference'] >= -1000] ] for f in formats if f.get('preference') is None or f['preference'] >= -1000]
header_line = self._list_format_headers( header_line = self._list_format_headers(
'ID', 'EXT', 'RESOLUTION', '\tFPS', 'HDR', delim, '\tFILESIZE', '\tTBR', 'PROTO', 'ID', 'EXT', 'RESOLUTION', '\tFPS', 'HDR', 'CH', delim, '\tFILESIZE', '\tTBR', 'PROTO',
delim, 'VCODEC', '\tVBR', 'ACODEC', '\tABR', '\tASR', 'MORE INFO') delim, 'VCODEC', '\tVBR', 'ACODEC', '\tABR', '\tASR', 'MORE INFO')
return render_table( return render_table(
@ -3601,7 +3645,7 @@ def render_thumbnails_table(self, info_dict):
return None return None
return render_table( return render_table(
self._list_format_headers('ID', 'Width', 'Height', 'URL'), self._list_format_headers('ID', 'Width', 'Height', 'URL'),
[[t.get('id'), t.get('width', 'unknown'), t.get('height', 'unknown'), t['url']] for t in thumbnails]) [[t.get('id'), t.get('width') or 'unknown', t.get('height') or 'unknown', t['url']] for t in thumbnails])
def render_subtitles_table(self, video_id, subtitles): def render_subtitles_table(self, video_id, subtitles):
def _row(lang, formats): def _row(lang, formats):
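Several hunks here replace `d.get(key, default)` with `d.get(key) or default`. The difference only shows up when the key is present with a value of `None`, as this small example with an invented thumbnail dict illustrates:

```python
thumbnail = {'id': '0', 'width': None, 'url': 'https://example.com/t.jpg'}

# The default is used only when the key is missing, so None leaks through
with_default = thumbnail.get('width', 'unknown')

# "or" also replaces a present-but-None (or otherwise falsy) value
with_or = thumbnail.get('width') or 'unknown'

print(with_default, with_or)
```

The same reasoning applies to `(f.get('preference') or 0) >= -1000`, which avoids comparing `None` with an integer.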
@ -3644,6 +3688,8 @@ def print_debug_header(self):
if not self.params.get('verbose'): if not self.params.get('verbose'):
return return
from . import _IN_CLI # Must be delayed import
# These imports can be slow. So import them only as needed # These imports can be slow. So import them only as needed
from .extractor.extractors import _LAZY_LOADER from .extractor.extractors import _LAZY_LOADER
from .extractor.extractors import _PLUGIN_CLASSES as plugin_extractors from .extractor.extractors import _PLUGIN_CLASSES as plugin_extractors
@ -3673,10 +3719,14 @@ def get_encoding(stream):
write_debug = lambda msg: self._write_string(f'[debug] {msg}\n') write_debug = lambda msg: self._write_string(f'[debug] {msg}\n')
source = detect_variant() source = detect_variant()
if VARIANT not in (None, 'pip'):
source += '*'
write_debug(join_nonempty( write_debug(join_nonempty(
'yt-dlp version', __version__, f'{"yt-dlp" if REPOSITORY == "yt-dlp/yt-dlp" else REPOSITORY} version',
__version__,
f'[{RELEASE_GIT_HEAD}]' if RELEASE_GIT_HEAD else '', f'[{RELEASE_GIT_HEAD}]' if RELEASE_GIT_HEAD else '',
'' if source == 'unknown' else f'({source})', '' if source == 'unknown' else f'({source})',
'' if _IN_CLI else 'API',
delim=' ')) delim=' '))
if not _LAZY_LOADER: if not _LAZY_LOADER:
if os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'): if os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'):
@ -3690,18 +3740,8 @@ def get_encoding(stream):
if self.params['compat_opts']: if self.params['compat_opts']:
write_debug('Compatibility options: %s' % ', '.join(self.params['compat_opts'])) write_debug('Compatibility options: %s' % ', '.join(self.params['compat_opts']))
if source == 'source': if current_git_head():
try: write_debug(f'Git HEAD: {current_git_head()}')
stdout, _, _ = Popen.run(
['git', 'rev-parse', '--short', 'HEAD'],
text=True, cwd=os.path.dirname(os.path.abspath(__file__)),
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if re.fullmatch('[0-9a-f]+', stdout.strip()):
write_debug(f'Git HEAD: {stdout.strip()}')
except Exception:
with contextlib.suppress(Exception):
sys.exc_clear()
write_debug(system_identifier()) write_debug(system_identifier())
exe_versions, ffmpeg_features = FFmpegPostProcessor.get_versions_and_features(self) exe_versions, ffmpeg_features = FFmpegPostProcessor.get_versions_and_features(self)


@ -63,6 +63,8 @@
) )
from .YoutubeDL import YoutubeDL from .YoutubeDL import YoutubeDL
_IN_CLI = False
def _exit(status=0, *args): def _exit(status=0, *args):
for msg in args: for msg in args:
@ -324,14 +326,15 @@ def validate_outtmpl(tmpl, msg):
def parse_chapters(name, value): def parse_chapters(name, value):
chapters, ranges = [], [] chapters, ranges = [], []
parse_timestamp = lambda x: float('inf') if x in ('inf', 'infinite') else parse_duration(x)
for regex in value or []: for regex in value or []:
if regex.startswith('*'): if regex.startswith('*'):
for range in regex[1:].split(','): for range_ in map(str.strip, regex[1:].split(',')):
dur = tuple(map(parse_duration, range.strip().split('-'))) mobj = range_ != '-' and re.fullmatch(r'([^-]+)?\s*-\s*([^-]+)?', range_)
if len(dur) == 2 and all(t is not None for t in dur): dur = mobj and (parse_timestamp(mobj.group(1) or '0'), parse_timestamp(mobj.group(2) or 'inf'))
ranges.append(dur) if None in (dur or [None]):
else:
raise ValueError(f'invalid {name} time range "{regex}". Must be of the form *start-end') raise ValueError(f'invalid {name} time range "{regex}". Must be of the form *start-end')
ranges.append(dur)
continue continue
try: try:
chapters.append(re.compile(regex)) chapters.append(re.compile(regex))
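
The new time-range handling in the hunk above can be tried in isolation with the sketch below. It is only an approximation: `parse_timestamp` here is a hypothetical stand-in for yt-dlp's `parse_duration` (it only understands plain seconds, `MM:SS` and `HH:MM:SS`), and `parse_range` is an illustrative name that does not appear in the diff. The regular expression and the "missing start means 0, missing end means infinity" defaults are taken from the changed lines.

```python
import re


def parse_timestamp(text):
    """Hypothetical stand-in for yt-dlp's parse_duration plus the 'inf' shortcut."""
    text = text.strip()
    if text in ('inf', 'infinite'):
        return float('inf')
    seconds = 0.0
    try:
        # Accept plain seconds, MM:SS or HH:MM:SS
        for part in text.split(':'):
            seconds = seconds * 60 + float(part)
    except ValueError:
        return None
    return seconds


def parse_range(range_):
    """Parse 'start-end'; a missing start means 0, a missing end means infinity."""
    range_ = range_.strip()
    # A bare '-' is rejected up front, exactly as in the diff
    mobj = range_ != '-' and re.fullmatch(r'([^-]+)?\s*-\s*([^-]+)?', range_)
    dur = mobj and (parse_timestamp(mobj.group(1) or '0'),
                    parse_timestamp(mobj.group(2) or 'inf'))
    if None in (dur or [None]):
        raise ValueError(f'invalid time range "{range_}"')
    return dur
```

With these assumptions, `parse_range('1:30-')` yields an open-ended range starting at 90 seconds, while `'-'` and unparsable endpoints raise `ValueError`.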
@ -344,10 +347,16 @@ def parse_chapters(name, value):
# Cookies from browser # Cookies from browser
if opts.cookiesfrombrowser: if opts.cookiesfrombrowser:
mobj = re.match(r'(?P<name>[^+:]+)(\s*\+\s*(?P<keyring>[^:]+))?(\s*:(?P<profile>.+))?', opts.cookiesfrombrowser) container = None
mobj = re.fullmatch(r'''(?x)
(?P<name>[^+:]+)
(?:\s*\+\s*(?P<keyring>[^:]+))?
(?:\s*:\s*(?P<profile>.+?))?
(?:\s*::\s*(?P<container>.+))?
''', opts.cookiesfrombrowser)
if mobj is None: if mobj is None:
raise ValueError(f'invalid cookies from browser arguments: {opts.cookiesfrombrowser}') raise ValueError(f'invalid cookies from browser arguments: {opts.cookiesfrombrowser}')
browser_name, keyring, profile = mobj.group('name', 'keyring', 'profile') browser_name, keyring, profile, container = mobj.group('name', 'keyring', 'profile', 'container')
browser_name = browser_name.lower() browser_name = browser_name.lower()
if browser_name not in SUPPORTED_BROWSERS: if browser_name not in SUPPORTED_BROWSERS:
raise ValueError(f'unsupported browser specified for cookies: "{browser_name}". ' raise ValueError(f'unsupported browser specified for cookies: "{browser_name}". '
@ -357,7 +366,7 @@ def parse_chapters(name, value):
if keyring not in SUPPORTED_KEYRINGS: if keyring not in SUPPORTED_KEYRINGS:
raise ValueError(f'unsupported keyring specified for cookies: "{keyring}". ' raise ValueError(f'unsupported keyring specified for cookies: "{keyring}". '
f'Supported keyrings are: {", ".join(sorted(SUPPORTED_KEYRINGS))}') f'Supported keyrings are: {", ".join(sorted(SUPPORTED_KEYRINGS))}')
opts.cookiesfrombrowser = (browser_name, profile, keyring) opts.cookiesfrombrowser = (browser_name, profile, keyring, container)
# MetadataParser # MetadataParser
def metadataparser_actions(f): def metadataparser_actions(f):
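
As a rough illustration of the `--cookies-from-browser` change above, the sketch below reuses the verbose pattern from the diff to split a specification into its four parts. `SPEC_RE` and `split_spec` are hypothetical names chosen for this example, and the browser/keyring validation that follows in the real code is left out.

```python
import re

# Pattern copied from the diff: BROWSER[+KEYRING][:PROFILE][::CONTAINER]
SPEC_RE = re.compile(r'''(?x)
    (?P<name>[^+:]+)
    (?:\s*\+\s*(?P<keyring>[^:]+))?
    (?:\s*:\s*(?P<profile>.+?))?
    (?:\s*::\s*(?P<container>.+))?
''')


def split_spec(spec):
    """Return (name, keyring, profile, container); absent parts are None."""
    mobj = SPEC_RE.fullmatch(spec)
    if mobj is None:
        raise ValueError(f'invalid cookies from browser arguments: {spec}')
    return mobj.group('name', 'keyring', 'profile', 'container')
```

Because the whole string must match (`fullmatch`), the lazy `profile` group only stops early when a `::container` suffix follows it.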
@ -402,6 +411,9 @@ def metadataparser_actions(f):
if opts.download_archive is not None: if opts.download_archive is not None:
opts.download_archive = expand_path(opts.download_archive) opts.download_archive = expand_path(opts.download_archive)
if opts.ffmpeg_location is not None:
opts.ffmpeg_location = expand_path(opts.ffmpeg_location)
if opts.user_agent is not None: if opts.user_agent is not None:
opts.headers.setdefault('User-Agent', opts.user_agent) opts.headers.setdefault('User-Agent', opts.user_agent)
if opts.referer is not None: if opts.referer is not None:
@ -477,7 +489,7 @@ def report_conflict(arg1, opt1, arg2='--allow-unplayable-formats', opt2='allow_u
val1=opts.sponskrub and opts.sponskrub_cut) val1=opts.sponskrub and opts.sponskrub_cut)
# Conflicts with --allow-unplayable-formats # Conflicts with --allow-unplayable-formats
report_conflict('--add-metadata', 'addmetadata') report_conflict('--embed-metadata', 'addmetadata')
report_conflict('--embed-chapters', 'addchapters') report_conflict('--embed-chapters', 'addchapters')
report_conflict('--embed-info-json', 'embed_infojson') report_conflict('--embed-info-json', 'embed_infojson')
report_conflict('--embed-subs', 'embedsubtitles') report_conflict('--embed-subs', 'embedsubtitles')
@ -766,6 +778,7 @@ def parse_options(argv=None):
'windowsfilenames': opts.windowsfilenames, 'windowsfilenames': opts.windowsfilenames,
'ignoreerrors': opts.ignoreerrors, 'ignoreerrors': opts.ignoreerrors,
'force_generic_extractor': opts.force_generic_extractor, 'force_generic_extractor': opts.force_generic_extractor,
'allowed_extractors': opts.allowed_extractors or ['default'],
'ratelimit': opts.ratelimit, 'ratelimit': opts.ratelimit,
'throttledratelimit': opts.throttledratelimit, 'throttledratelimit': opts.throttledratelimit,
'overwrites': opts.overwrites, 'overwrites': opts.overwrites,
@ -949,6 +962,8 @@ def _real_main(argv=None):
def main(argv=None): def main(argv=None):
global _IN_CLI
_IN_CLI = True
try: try:
_exit(*variadic(_real_main(argv))) _exit(*variadic(_real_main(argv)))
except DownloadError: except DownloadError:


@ -6,7 +6,8 @@
import shutil import shutil
import traceback import traceback
from .utils import expand_path, write_json_file from .utils import expand_path, traverse_obj, version_tuple, write_json_file
from .version import __version__
class Cache: class Cache:
@ -45,12 +46,20 @@ def store(self, section, key, data, dtype='json'):
if ose.errno != errno.EEXIST: if ose.errno != errno.EEXIST:
raise raise
self._ydl.write_debug(f'Saving {section}.{key} to cache') self._ydl.write_debug(f'Saving {section}.{key} to cache')
write_json_file(data, fn) write_json_file({'yt-dlp_version': __version__, 'data': data}, fn)
except Exception: except Exception:
tb = traceback.format_exc() tb = traceback.format_exc()
self._ydl.report_warning(f'Writing cache to {fn!r} failed: {tb}') self._ydl.report_warning(f'Writing cache to {fn!r} failed: {tb}')
def load(self, section, key, dtype='json', default=None): def _validate(self, data, min_ver):
version = traverse_obj(data, 'yt-dlp_version')
if not version: # Backward compatibility
data, version = {'data': data}, '2022.08.19'
if not min_ver or version_tuple(version) >= version_tuple(min_ver):
return data['data']
self._ydl.write_debug(f'Discarding old cache from version {version} (needs {min_ver})')
def load(self, section, key, dtype='json', default=None, *, min_ver=None):
assert dtype in ('json',) assert dtype in ('json',)
if not self.enabled: if not self.enabled:
@ -61,8 +70,8 @@ def load(self, section, key, dtype='json', default=None):
try: try:
with open(cache_fn, encoding='utf-8') as cachef: with open(cache_fn, encoding='utf-8') as cachef:
self._ydl.write_debug(f'Loading {section}.{key} from cache') self._ydl.write_debug(f'Loading {section}.{key} from cache')
return json.load(cachef) return self._validate(json.load(cachef), min_ver)
except ValueError: except (ValueError, KeyError):
try: try:
file_size = os.path.getsize(cache_fn) file_size = os.path.getsize(cache_fn)
except OSError as oe: except OSError as oe:
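
The versioned cache format introduced above can be sketched without the rest of yt-dlp. In this minimal example, `wrap` and `validate` are illustrative names, and `version_tuple` is a simplified stand-in for the helper imported from `.utils` (it assumes purely numeric, dot-separated versions). The `'2022.08.19'` fallback for unversioned payloads is the value used in the diff.

```python
def version_tuple(version):
    # Simplified stand-in: assumes dot-separated integers such as '2022.10.04'
    return tuple(int(part) for part in version.split('.'))


def wrap(data, version):
    """Shape of what store() now writes to disk."""
    return {'yt-dlp_version': version, 'data': data}


def validate(payload, min_ver=None, legacy_version='2022.08.19'):
    """Return the cached data, or None when it is older than min_ver."""
    version = payload.get('yt-dlp_version') if isinstance(payload, dict) else None
    if not version:
        # Files written before this change hold the bare data
        payload, version = {'data': payload}, legacy_version
    if not min_ver or version_tuple(version) >= version_tuple(min_ver):
        return payload['data']
    return None
```

A caller that changed its cached data layout would pass the first compatible release as `min_ver`, so that older entries are discarded rather than misread.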


@ -1,8 +1,10 @@
import base64 import base64
import contextlib import contextlib
import http.cookiejar import http.cookiejar
import http.cookies
import json import json
import os import os
import re
import shutil import shutil
import struct import struct
import subprocess import subprocess
@ -24,7 +26,14 @@
sqlite3, sqlite3,
) )
from .minicurses import MultilinePrinter, QuietMultilinePrinter from .minicurses import MultilinePrinter, QuietMultilinePrinter
from .utils import Popen, YoutubeDLCookieJar, error_to_str, expand_path from .utils import (
Popen,
YoutubeDLCookieJar,
error_to_str,
expand_path,
is_path_like,
try_call,
)
CHROMIUM_BASED_BROWSERS = {'brave', 'chrome', 'chromium', 'edge', 'opera', 'vivaldi'} CHROMIUM_BASED_BROWSERS = {'brave', 'chrome', 'chromium', 'edge', 'opera', 'vivaldi'}
SUPPORTED_BROWSERS = CHROMIUM_BASED_BROWSERS | {'firefox', 'safari'} SUPPORTED_BROWSERS = CHROMIUM_BASED_BROWSERS | {'firefox', 'safari'}
@ -85,11 +94,12 @@ def _create_progress_bar(logger):
def load_cookies(cookie_file, browser_specification, ydl): def load_cookies(cookie_file, browser_specification, ydl):
cookie_jars = [] cookie_jars = []
if browser_specification is not None: if browser_specification is not None:
browser_name, profile, keyring = _parse_browser_specification(*browser_specification) browser_name, profile, keyring, container = _parse_browser_specification(*browser_specification)
cookie_jars.append(extract_cookies_from_browser(browser_name, profile, YDLLogger(ydl), keyring=keyring)) cookie_jars.append(
extract_cookies_from_browser(browser_name, profile, YDLLogger(ydl), keyring=keyring, container=container))
if cookie_file is not None: if cookie_file is not None:
is_filename = YoutubeDLCookieJar.is_path(cookie_file) is_filename = is_path_like(cookie_file)
if is_filename: if is_filename:
cookie_file = expand_path(cookie_file) cookie_file = expand_path(cookie_file)
@ -101,9 +111,9 @@ def load_cookies(cookie_file, browser_specification, ydl):
return _merge_cookie_jars(cookie_jars) return _merge_cookie_jars(cookie_jars)
def extract_cookies_from_browser(browser_name, profile=None, logger=YDLLogger(), *, keyring=None): def extract_cookies_from_browser(browser_name, profile=None, logger=YDLLogger(), *, keyring=None, container=None):
if browser_name == 'firefox': if browser_name == 'firefox':
return _extract_firefox_cookies(profile, logger) return _extract_firefox_cookies(profile, container, logger)
elif browser_name == 'safari': elif browser_name == 'safari':
return _extract_safari_cookies(profile, logger) return _extract_safari_cookies(profile, logger)
elif browser_name in CHROMIUM_BASED_BROWSERS: elif browser_name in CHROMIUM_BASED_BROWSERS:
@ -112,7 +122,7 @@ def extract_cookies_from_browser(browser_name, profile=None, logger=YDLLogger(),
raise ValueError(f'unknown browser: {browser_name}') raise ValueError(f'unknown browser: {browser_name}')
def _extract_firefox_cookies(profile, logger): def _extract_firefox_cookies(profile, container, logger):
logger.info('Extracting cookies from firefox') logger.info('Extracting cookies from firefox')
if not sqlite3: if not sqlite3:
logger.warning('Cannot extract cookies from firefox without sqlite3 support. ' logger.warning('Cannot extract cookies from firefox without sqlite3 support. '
@ -131,10 +141,35 @@ def _extract_firefox_cookies(profile, logger):
raise FileNotFoundError(f'could not find firefox cookies database in {search_root}') raise FileNotFoundError(f'could not find firefox cookies database in {search_root}')
logger.debug(f'Extracting cookies from: "{cookie_database_path}"') logger.debug(f'Extracting cookies from: "{cookie_database_path}"')
container_id = None
if container not in (None, 'none'):
containers_path = os.path.join(os.path.dirname(cookie_database_path), 'containers.json')
if not os.path.isfile(containers_path) or not os.access(containers_path, os.R_OK):
raise FileNotFoundError(f'could not read containers.json in {search_root}')
with open(containers_path) as containers:
identities = json.load(containers).get('identities', [])
container_id = next((context.get('userContextId') for context in identities if container in (
context.get('name'),
try_call(lambda: re.fullmatch(r'userContext([^\.]+)\.label', context['l10nID']).group())
)), None)
if not isinstance(container_id, int):
raise ValueError(f'could not find firefox container "{container}" in containers.json')
with tempfile.TemporaryDirectory(prefix='yt_dlp') as tmpdir: with tempfile.TemporaryDirectory(prefix='yt_dlp') as tmpdir:
cursor = None cursor = None
try: try:
cursor = _open_database_copy(cookie_database_path, tmpdir) cursor = _open_database_copy(cookie_database_path, tmpdir)
if isinstance(container_id, int):
logger.debug(
f'Only loading cookies from firefox container "{container}", ID {container_id}')
cursor.execute(
'SELECT host, name, value, path, expiry, isSecure FROM moz_cookies WHERE originAttributes LIKE ? OR originAttributes LIKE ?',
(f'%userContextId={container_id}', f'%userContextId={container_id}&%'))
elif container == 'none':
logger.debug('Only loading cookies not belonging to any container')
cursor.execute(
'SELECT host, name, value, path, expiry, isSecure FROM moz_cookies WHERE NOT INSTR(originAttributes,"userContextId=")')
else:
cursor.execute('SELECT host, name, value, path, expiry, isSecure FROM moz_cookies') cursor.execute('SELECT host, name, value, path, expiry, isSecure FROM moz_cookies')
jar = YoutubeDLCookieJar() jar = YoutubeDLCookieJar()
with _create_progress_bar(logger) as progress_bar: with _create_progress_bar(logger) as progress_bar:
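
To see how the two container filters above behave, the following sketch runs the same `WHERE` clauses against an in-memory SQLite table. The table is a heavily simplified assumption (only `name` and `originAttributes` columns), the sample `originAttributes` values are made up for illustration, and both function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE moz_cookies (name TEXT, originAttributes TEXT)')
conn.executemany('INSERT INTO moz_cookies VALUES (?, ?)', [
    ('plain', ''),
    ('in_1', '^userContextId=1'),
    ('in_12', '^userContextId=12'),
    ('in_1_fpd', '^userContextId=1&firstPartyDomain=example.com'),
])


def names_in_container(conn, container_id):
    # Two patterns: the ID is either the last attribute or is followed by '&'
    rows = conn.execute(
        'SELECT name FROM moz_cookies '
        'WHERE originAttributes LIKE ? OR originAttributes LIKE ?',
        (f'%userContextId={container_id}', f'%userContextId={container_id}&%'))
    return sorted(name for name, in rows)


def names_outside_containers(conn):
    # INSTR returns 0 when the substring is absent, so NOT INSTR(...) is true
    rows = conn.execute(
        "SELECT name FROM moz_cookies WHERE NOT INSTR(originAttributes, 'userContextId=')")
    return sorted(name for name, in rows)
```

Using two `LIKE` patterns rather than a single `%userContextId=1%` keeps container 1 from also matching container 12.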
@ -810,12 +845,15 @@ def _get_linux_keyring_password(browser_keyring_name, keyring, logger):
def _get_mac_keyring_password(browser_keyring_name, logger): def _get_mac_keyring_password(browser_keyring_name, logger):
logger.debug('using find-generic-password to obtain password from OSX keychain') logger.debug('using find-generic-password to obtain password from OSX keychain')
try: try:
stdout, _, _ = Popen.run( stdout, _, returncode = Popen.run(
['security', 'find-generic-password', ['security', 'find-generic-password',
'-w', # write password to stdout '-w', # write password to stdout
'-a', browser_keyring_name, # match 'account' '-a', browser_keyring_name, # match 'account'
'-s', f'{browser_keyring_name} Safe Storage'], # match 'service' '-s', f'{browser_keyring_name} Safe Storage'], # match 'service'
stdout=subprocess.PIPE, stderr=subprocess.DEVNULL) stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
if returncode:
logger.warning('find-generic-password failed')
return None
return stdout.rstrip(b'\n') return stdout.rstrip(b'\n')
except Exception as e: except Exception as e:
logger.warning(f'exception running find-generic-password: {error_to_str(e)}') logger.warning(f'exception running find-generic-password: {error_to_str(e)}')
@ -948,11 +986,102 @@ def _is_path(value):
return os.path.sep in value return os.path.sep in value
def _parse_browser_specification(browser_name, profile=None, keyring=None): def _parse_browser_specification(browser_name, profile=None, keyring=None, container=None):
if browser_name not in SUPPORTED_BROWSERS: if browser_name not in SUPPORTED_BROWSERS:
raise ValueError(f'unsupported browser: "{browser_name}"') raise ValueError(f'unsupported browser: "{browser_name}"')
if keyring not in (None, *SUPPORTED_KEYRINGS): if keyring not in (None, *SUPPORTED_KEYRINGS):
raise ValueError(f'unsupported keyring: "{keyring}"') raise ValueError(f'unsupported keyring: "{keyring}"')
if profile is not None and _is_path(profile): if profile is not None and _is_path(expand_path(profile)):
profile = os.path.expanduser(profile) profile = expand_path(profile)
return browser_name, profile, keyring return browser_name, profile, keyring, container
class LenientSimpleCookie(http.cookies.SimpleCookie):
"""More lenient version of http.cookies.SimpleCookie"""
# From https://github.com/python/cpython/blob/v3.10.7/Lib/http/cookies.py
# We use Morsel's legal key chars to avoid errors on setting values
_LEGAL_KEY_CHARS = r'\w\d' + re.escape('!#$%&\'*+-.:^_`|~')
_LEGAL_VALUE_CHARS = _LEGAL_KEY_CHARS + re.escape('(),/<=>?@[]{}')
_RESERVED = {
"expires",
"path",
"comment",
"domain",
"max-age",
"secure",
"httponly",
"version",
"samesite",
}
_FLAGS = {"secure", "httponly"}
# Added 'bad' group to catch the remaining value
_COOKIE_PATTERN = re.compile(r"""
\s* # Optional whitespace at start of cookie
(?P<key> # Start of group 'key'
[""" + _LEGAL_KEY_CHARS + r"""]+?# Any word of at least one letter
) # End of group 'key'
( # Optional group: there may not be a value.
\s*=\s* # Equal Sign
( # Start of potential value
(?P<val> # Start of group 'val'
"(?:[^\\"]|\\.)*" # Any doublequoted string
| # or
\w{3},\s[\w\d\s-]{9,11}\s[\d:]{8}\sGMT # Special case for "expires" attr
| # or
[""" + _LEGAL_VALUE_CHARS + r"""]* # Any word or empty string
) # End of group 'val'
| # or
(?P<bad>(?:\\;|[^;])*?) # 'bad' group fallback for invalid values
) # End of potential value
)? # End of optional value group
\s* # Any number of spaces.
(\s+|;|$) # Ending either at space, semicolon, or EOS.
""", re.ASCII | re.VERBOSE)
def load(self, data):
# Workaround for https://github.com/yt-dlp/yt-dlp/issues/4776
if not isinstance(data, str):
return super().load(data)
morsel = None
for match in self._COOKIE_PATTERN.finditer(data):
if match.group('bad'):
morsel = None
continue
key, value = match.group('key', 'val')
is_attribute = False
if key.startswith('$'):
key = key[1:]
is_attribute = True
lower_key = key.lower()
if lower_key in self._RESERVED:
if morsel is None:
continue
if value is None:
if lower_key not in self._FLAGS:
morsel = None
continue
value = True
else:
value, _ = self.value_decode(value)
morsel[key] = value
elif is_attribute:
morsel = None
elif value is not None:
morsel = self.get(key, http.cookies.Morsel())
real_value, coded_value = self.value_decode(value)
morsel.set(key, real_value, coded_value)
self[key] = morsel
else:
morsel = None


@ -24,6 +24,7 @@
encodeFilename, encodeFilename,
format_bytes, format_bytes,
join_nonempty, join_nonempty,
remove_start,
sanitize_open, sanitize_open,
shell_quote, shell_quote,
timeconvert, timeconvert,
@ -92,6 +93,7 @@ def _set_ydl(self, ydl):
for func in ( for func in (
'deprecation_warning', 'deprecation_warning',
'deprecated_feature',
'report_error', 'report_error',
'report_file_already_downloaded', 'report_file_already_downloaded',
'report_warning', 'report_warning',
@ -119,11 +121,11 @@ def format_seconds(seconds):
time = timetuple_from_msec(seconds * 1000) time = timetuple_from_msec(seconds * 1000)
if time.hours > 99: if time.hours > 99:
return '--:--:--' return '--:--:--'
if not time.hours:
return '%02d:%02d' % time[1:-1]
return '%02d:%02d:%02d' % time[:-1] return '%02d:%02d:%02d' % time[:-1]
format_eta = format_seconds @classmethod
def format_eta(cls, seconds):
return f'{remove_start(cls.format_seconds(seconds), "00:"):>8s}'
@staticmethod @staticmethod
def calc_percent(byte_counter, data_len): def calc_percent(byte_counter, data_len):
@ -331,6 +333,8 @@ def with_fields(*tups, default=''):
return tmpl return tmpl
return default return default
_format_bytes = lambda k: f'{format_bytes(s.get(k)):>10s}'
if s['status'] == 'finished': if s['status'] == 'finished':
if self.params.get('noprogress'): if self.params.get('noprogress'):
self.to_screen('[download] Download completed') self.to_screen('[download] Download completed')
@ -338,7 +342,7 @@ def with_fields(*tups, default=''):
s.update({ s.update({
'speed': speed, 'speed': speed,
'_speed_str': self.format_speed(speed).strip(), '_speed_str': self.format_speed(speed).strip(),
'_total_bytes_str': format_bytes(s.get('total_bytes')), '_total_bytes_str': _format_bytes('total_bytes'),
'_elapsed_str': self.format_seconds(s.get('elapsed')), '_elapsed_str': self.format_seconds(s.get('elapsed')),
'_percent_str': self.format_percent(100), '_percent_str': self.format_percent(100),
}) })
@ -353,15 +357,15 @@ def with_fields(*tups, default=''):
return return
s.update({ s.update({
'_eta_str': self.format_eta(s.get('eta')), '_eta_str': self.format_eta(s.get('eta')).strip(),
'_speed_str': self.format_speed(s.get('speed')), '_speed_str': self.format_speed(s.get('speed')),
'_percent_str': self.format_percent(try_call( '_percent_str': self.format_percent(try_call(
lambda: 100 * s['downloaded_bytes'] / s['total_bytes'], lambda: 100 * s['downloaded_bytes'] / s['total_bytes'],
lambda: 100 * s['downloaded_bytes'] / s['total_bytes_estimate'], lambda: 100 * s['downloaded_bytes'] / s['total_bytes_estimate'],
lambda: s['downloaded_bytes'] == 0 and 0)), lambda: s['downloaded_bytes'] == 0 and 0)),
'_total_bytes_str': format_bytes(s.get('total_bytes')), '_total_bytes_str': _format_bytes('total_bytes'),
'_total_bytes_estimate_str': format_bytes(s.get('total_bytes_estimate')), '_total_bytes_estimate_str': _format_bytes('total_bytes_estimate'),
'_downloaded_bytes_str': format_bytes(s.get('downloaded_bytes')), '_downloaded_bytes_str': _format_bytes('downloaded_bytes'),
'_elapsed_str': self.format_seconds(s.get('elapsed')), '_elapsed_str': self.format_seconds(s.get('elapsed')),
}) })
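
The interaction between `format_seconds` and the new `format_eta` can be reproduced with the standalone sketch below. It assumes integer second counts and replaces `timetuple_from_msec` and `remove_start` with small local equivalents, so it approximates the methods in the diff rather than copying them.

```python
def format_seconds(seconds):
    # Always HH:MM:SS now; the short MM:SS form was removed in this change
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    if hours > 99:
        return '--:--:--'
    return '%02d:%02d:%02d' % (hours, minutes, secs)


def remove_start(text, start):
    # Local equivalent of the helper imported from .utils
    return text[len(start):] if text.startswith(start) else text


def format_eta(seconds):
    # Drop a leading '00:' and right-align to a fixed width of 8 characters
    return f'{remove_start(format_seconds(seconds), "00:"):>8s}'
```

The fixed width keeps the progress line from shifting when the ETA crosses the one-hour mark, which is presumably why `_eta_str` is stripped separately for template use.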


@ -51,7 +51,7 @@ def real_download(self, filename, info_dict):
args.append([ctx, fragments_to_download, fmt]) args.append([ctx, fragments_to_download, fmt])
return self.download_and_append_fragments_multiple(*args) return self.download_and_append_fragments_multiple(*args, is_fatal=lambda idx: idx == 0)
def _resolve_fragments(self, fragments, ctx): def _resolve_fragments(self, fragments, ctx):
fragments = fragments(ctx) if callable(fragments) else fragments fragments = fragments(ctx) if callable(fragments) else fragments


@ -252,6 +252,10 @@ def supports_manifest(manifest):
check_results = (not re.search(feature, manifest) for feature in UNSUPPORTED_FEATURES) check_results = (not re.search(feature, manifest) for feature in UNSUPPORTED_FEATURES)
return all(check_results) return all(check_results)
@staticmethod
def _aria2c_filename(fn):
return fn if os.path.isabs(fn) else f'.{os.path.sep}{fn}'
def _make_cmd(self, tmpfilename, info_dict): def _make_cmd(self, tmpfilename, info_dict):
cmd = [self.exe, '-c', cmd = [self.exe, '-c',
'--console-log-level=warn', '--summary-interval=0', '--download-result=hide', '--console-log-level=warn', '--summary-interval=0', '--download-result=hide',
@ -280,11 +284,9 @@ def _make_cmd(self, tmpfilename, info_dict):
# https://github.com/aria2/aria2/issues/1373 # https://github.com/aria2/aria2/issues/1373
dn = os.path.dirname(tmpfilename) dn = os.path.dirname(tmpfilename)
if dn: if dn:
if not os.path.isabs(dn): cmd += ['--dir', self._aria2c_filename(dn) + os.path.sep]
dn = f'.{os.path.sep}{dn}'
cmd += ['--dir', dn + os.path.sep]
if 'fragments' not in info_dict: if 'fragments' not in info_dict:
cmd += ['--out', f'.{os.path.sep}{os.path.basename(tmpfilename)}'] cmd += ['--out', self._aria2c_filename(os.path.basename(tmpfilename))]
cmd += ['--auto-file-renaming=false'] cmd += ['--auto-file-renaming=false']
if 'fragments' in info_dict: if 'fragments' in info_dict:
@ -293,11 +295,11 @@ def _make_cmd(self, tmpfilename, info_dict):
url_list = [] url_list = []
for frag_index, fragment in enumerate(info_dict['fragments']): for frag_index, fragment in enumerate(info_dict['fragments']):
fragment_filename = '%s-Frag%d' % (os.path.basename(tmpfilename), frag_index) fragment_filename = '%s-Frag%d' % (os.path.basename(tmpfilename), frag_index)
url_list.append('%s\n\tout=%s' % (fragment['url'], fragment_filename)) url_list.append('%s\n\tout=%s' % (fragment['url'], self._aria2c_filename(fragment_filename)))
stream, _ = self.sanitize_open(url_list_file, 'wb') stream, _ = self.sanitize_open(url_list_file, 'wb')
stream.write('\n'.join(url_list).encode()) stream.write('\n'.join(url_list).encode())
stream.close() stream.close()
cmd += ['-i', url_list_file] cmd += ['-i', self._aria2c_filename(url_list_file)]
else: else:
cmd += ['--', info_dict['url']] cmd += ['--', info_dict['url']]
return cmd return cmd
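
The `_aria2c_filename` helper added above is small enough to exercise directly. The sketch below copies its one-line body into a free function (the name `aria2c_filename` is only for this example) to show which paths get the `./` prefix.

```python
import os


def aria2c_filename(fn):
    # Absolute paths pass through; relative ones are anchored to the current directory
    return fn if os.path.isabs(fn) else f'.{os.path.sep}{fn}'
```

In the diff, the same helper is applied to `--dir`, `--out`, each fragment's `out=` entry and the `-i` URL list file, so all relative paths handed to aria2c are written the same way.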
@ -515,16 +517,14 @@ class AVconvFD(FFmpegFD):
if name.endswith('FD') and name not in ('ExternalFD', 'FragmentFD') if name.endswith('FD') and name not in ('ExternalFD', 'FragmentFD')
} }
_BY_EXE = {klass.EXE_NAME: klass for klass in _BY_NAME.values()}
def list_external_downloaders(): def list_external_downloaders():
return sorted(_BY_NAME.keys()) return sorted(_BY_NAME.keys())
def get_external_downloader(external_downloader): def get_external_downloader(external_downloader):
""" Given the name of the executable, see whether we support the given """ Given the name of the executable, see whether we support the given downloader """
downloader . """
# Drop .exe extension on Windows
bn = os.path.splitext(os.path.basename(external_downloader))[0] bn = os.path.splitext(os.path.basename(external_downloader))[0]
return _BY_NAME.get(bn, _BY_EXE.get(bn)) return _BY_NAME.get(bn) or next((
klass for klass in _BY_NAME.values() if klass.EXE_NAME in bn
), None)


@ -184,7 +184,7 @@ def build_fragments_list(boot_info):
first_frag_number = fragment_run_entry_table[0]['first'] first_frag_number = fragment_run_entry_table[0]['first']
fragments_counter = itertools.count(first_frag_number) fragments_counter = itertools.count(first_frag_number)
for segment, fragments_count in segment_run_table['segment_run']: for segment, fragments_count in segment_run_table['segment_run']:
# In some live HDS streams (for example Rai), `fragments_count` is # In some live HDS streams (e.g. Rai), `fragments_count` is
# abnormal and causing out-of-memory errors. It's OK to change the # abnormal and causing out-of-memory errors. It's OK to change the
# number of fragments for live streams as they are updated periodically # number of fragments for live streams as they are updated periodically
if fragments_count == 4294967295 and boot_info['live']: if fragments_count == 4294967295 and boot_info['live']:
@ -424,6 +424,4 @@ def real_download(self, filename, info_dict):
msg = 'Missed %d fragments' % (fragments_list[0][1] - (frag_i + 1)) msg = 'Missed %d fragments' % (fragments_list[0][1] - (frag_i + 1))
self.report_warning(msg) self.report_warning(msg)
self._finish_frag_download(ctx, info_dict) return self._finish_frag_download(ctx, info_dict)
return True


@ -65,8 +65,8 @@ class FragmentFD(FileDownloader):
""" """
def report_retry_fragment(self, err, frag_index, count, retries): def report_retry_fragment(self, err, frag_index, count, retries):
self.deprecation_warning( self.deprecation_warning('yt_dlp.downloader.FragmentFD.report_retry_fragment is deprecated. '
'yt_dlp.downloader.FragmentFD.report_retry_fragment is deprecated. Use yt_dlp.downloader.FileDownloader.report_retry instead') 'Use yt_dlp.downloader.FileDownloader.report_retry instead')
return self.report_retry(err, count, retries, frag_index) return self.report_retry(err, count, retries, frag_index)
def report_skip_fragment(self, frag_index, err=None): def report_skip_fragment(self, frag_index, err=None):
@ -295,16 +295,23 @@ def _finish_frag_download(self, ctx, info_dict):
self.try_remove(ytdl_filename) self.try_remove(ytdl_filename)
elapsed = time.time() - ctx['started'] elapsed = time.time() - ctx['started']
if ctx['tmpfilename'] == '-': to_file = ctx['tmpfilename'] != '-'
downloaded_bytes = ctx['complete_frags_downloaded_bytes'] if to_file:
downloaded_bytes = os.path.getsize(encodeFilename(ctx['tmpfilename']))
else: else:
downloaded_bytes = ctx['complete_frags_downloaded_bytes']
if not downloaded_bytes:
if to_file:
self.try_remove(ctx['tmpfilename'])
self.report_error('The downloaded file is empty')
return False
elif to_file:
self.try_rename(ctx['tmpfilename'], ctx['filename']) self.try_rename(ctx['tmpfilename'], ctx['filename'])
if self.params.get('updatetime', True):
filetime = ctx.get('fragment_filetime') filetime = ctx.get('fragment_filetime')
if filetime: if self.params.get('updatetime', True) and filetime:
with contextlib.suppress(Exception): with contextlib.suppress(Exception):
os.utime(ctx['filename'], (time.time(), filetime)) os.utime(ctx['filename'], (time.time(), filetime))
downloaded_bytes = os.path.getsize(encodeFilename(ctx['filename']))
self._hook_progress({ self._hook_progress({
'downloaded_bytes': downloaded_bytes, 'downloaded_bytes': downloaded_bytes,
@ -316,6 +323,7 @@ def _finish_frag_download(self, ctx, info_dict):
'max_progress': ctx.get('max_progress'), 'max_progress': ctx.get('max_progress'),
'progress_idx': ctx.get('progress_idx'), 'progress_idx': ctx.get('progress_idx'),
}, info_dict) }, info_dict)
return True
def _prepare_external_frag_download(self, ctx): def _prepare_external_frag_download(self, ctx):
if 'live' not in ctx: if 'live' not in ctx:
@ -362,7 +370,7 @@ def decrypt_fragment(fragment, frag_content):
return decrypt_fragment return decrypt_fragment
def download_and_append_fragments_multiple(self, *args, pack_func=None, finish_func=None): def download_and_append_fragments_multiple(self, *args, **kwargs):
''' '''
@params (ctx1, fragments1, info_dict1), (ctx2, fragments2, info_dict2), ... @params (ctx1, fragments1, info_dict1), (ctx2, fragments2, info_dict2), ...
all args must be either tuple or list all args must be either tuple or list
@ -370,7 +378,7 @@ def download_and_append_fragments_multiple(self, *args, pack_func=None, finish_f
interrupt_trigger = [True] interrupt_trigger = [True]
max_progress = len(args) max_progress = len(args)
if max_progress == 1: if max_progress == 1:
return self.download_and_append_fragments(*args[0], pack_func=pack_func, finish_func=finish_func) return self.download_and_append_fragments(*args[0], **kwargs)
max_workers = self.params.get('concurrent_fragment_downloads', 1) max_workers = self.params.get('concurrent_fragment_downloads', 1)
if max_progress > 1: if max_progress > 1:
self._prepare_multiline_status(max_progress) self._prepare_multiline_status(max_progress)
@ -380,8 +388,7 @@ def thread_func(idx, ctx, fragments, info_dict, tpe):
ctx['max_progress'] = max_progress ctx['max_progress'] = max_progress
ctx['progress_idx'] = idx ctx['progress_idx'] = idx
return self.download_and_append_fragments( return self.download_and_append_fragments(
ctx, fragments, info_dict, pack_func=pack_func, finish_func=finish_func, ctx, fragments, info_dict, **kwargs, tpe=tpe, interrupt_trigger=interrupt_trigger)
tpe=tpe, interrupt_trigger=interrupt_trigger)
class FTPE(concurrent.futures.ThreadPoolExecutor): class FTPE(concurrent.futures.ThreadPoolExecutor):
# has to stop this or it's going to wait on the worker thread itself # has to stop this or it's going to wait on the worker thread itself
@ -428,17 +435,12 @@ def interrupt_trigger_iter(fg):
return result return result
def download_and_append_fragments( def download_and_append_fragments(
self, ctx, fragments, info_dict, *, pack_func=None, finish_func=None, self, ctx, fragments, info_dict, *, is_fatal=(lambda idx: False),
tpe=None, interrupt_trigger=None): pack_func=(lambda content, idx: content), finish_func=None,
if not interrupt_trigger: tpe=None, interrupt_trigger=(True, )):
interrupt_trigger = (True, )
is_fatal = ( if not self.params.get('skip_unavailable_fragments', True):
((lambda _: False) if info_dict.get('is_live') else (lambda idx: idx == 0)) is_fatal = lambda _: True
if self.params.get('skip_unavailable_fragments', True) else (lambda _: True))
if not pack_func:
pack_func = lambda frag_content, _: frag_content
def download_fragment(fragment, ctx): def download_fragment(fragment, ctx):
if not interrupt_trigger[0]: if not interrupt_trigger[0]:
@ -527,5 +529,4 @@ def _download_fragment(fragment):
if finish_func is not None: if finish_func is not None:
ctx['dest_stream'].write(finish_func()) ctx['dest_stream'].write(finish_func())
ctx['dest_stream'].flush() ctx['dest_stream'].flush()
self._finish_frag_download(ctx, info_dict) return self._finish_frag_download(ctx, info_dict)
return True
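
After this change the caller decides which fragments are fatal, and the `skip_unavailable_fragments` option can only make the rule stricter. A minimal sketch of that selection logic follows; `resolve_is_fatal` is a hypothetical name, and `params` is assumed to be a plain dict.

```python
def resolve_is_fatal(params, is_fatal=lambda idx: False):
    """Pick the predicate that decides whether a failed fragment aborts the download."""
    if not params.get('skip_unavailable_fragments', True):
        # Skipping is disabled, so every fragment failure is fatal
        return lambda _: True
    # Otherwise keep the caller's rule (by default, nothing is fatal)
    return is_fatal
```

A caller such as the one earlier in this diff that passes `is_fatal=lambda idx: idx == 0` treats only a failure of the first fragment as fatal.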


@ -138,6 +138,8 @@ def write_piff_header(stream, params):
if fourcc == 'AACL': if fourcc == 'AACL':
sample_entry_box = box(b'mp4a', sample_entry_payload) sample_entry_box = box(b'mp4a', sample_entry_payload)
if fourcc == 'EC-3':
sample_entry_box = box(b'ec-3', sample_entry_payload)
elif stream_type == 'video': elif stream_type == 'video':
sample_entry_payload += u16.pack(0) # pre defined sample_entry_payload += u16.pack(0) # pre defined
sample_entry_payload += u16.pack(0) # reserved sample_entry_payload += u16.pack(0) # reserved
@ -278,5 +280,4 @@ def real_download(self, filename, info_dict):
                     return False
                 self.report_skip_fragment(frag_index)
-        self._finish_frag_download(ctx, info_dict)
-        return True
+        return self._finish_frag_download(ctx, info_dict)

View File

@ -186,5 +186,4 @@ def real_download(self, filename, info_dict):
         ctx['dest_stream'].write(
             b'--%b--\r\n\r\n' % frag_boundary.encode('us-ascii'))
-        self._finish_frag_download(ctx, info_dict)
-        return True
+        return self._finish_frag_download(ctx, info_dict)

View File

@ -191,8 +191,7 @@ def download_and_parse_fragment(url, frag_index, request_data=None, headers=None
             if test:
                 break
-        self._finish_frag_download(ctx, info_dict)
-        return True
+        return self._finish_frag_download(ctx, info_dict)
     @staticmethod
     def parse_live_timestamp(action):

View File

@ -1,5 +1,29 @@
 # flake8: noqa: F401
+from .youtube import (  # Youtube is moved to the top to improve performance
+    YoutubeIE,
+    YoutubeClipIE,
+    YoutubeFavouritesIE,
+    YoutubeNotificationsIE,
+    YoutubeHistoryIE,
+    YoutubeTabIE,
+    YoutubeLivestreamEmbedIE,
+    YoutubePlaylistIE,
+    YoutubeRecommendedIE,
+    YoutubeSearchDateIE,
+    YoutubeSearchIE,
+    YoutubeSearchURLIE,
+    YoutubeMusicSearchURLIE,
+    YoutubeSubscriptionsIE,
+    YoutubeStoriesIE,
+    YoutubeTruncatedIDIE,
+    YoutubeTruncatedURLIE,
+    YoutubeYtBeIE,
+    YoutubeYtUserIE,
+    YoutubeWatchLaterIE,
+    YoutubeShortsAudioPivotIE
+)
 from .abc import (
     ABCIE,
     ABCIViewIE,
@ -41,6 +65,7 @@
     HistoryPlayerIE,
     BiographyIE,
 )
+from .aeonco import AeonCoIE
 from .afreecatv import (
     AfreecaTVIE,
     AfreecaTVLiveIE,
@ -61,7 +86,6 @@
     AmericasTestKitchenSeasonIE,
 )
 from .angel import AngelIE
-from .animeondemand import AnimeOnDemandIE
 from .anvato import AnvatoIE
 from .aol import AolIE
 from .allocine import AllocineIE
@ -149,6 +173,7 @@
 from .behindkink import BehindKinkIE
 from .bellmedia import BellMediaIE
 from .beatport import BeatportIE
+from .berufetv import BerufeTVIE
 from .bet import BetIE
 from .bfi import BFIPlayerIE
 from .bfmtv import (
@ -168,7 +193,9 @@
     BilibiliAudioIE,
     BilibiliAudioAlbumIE,
     BiliBiliPlayerIE,
-    BilibiliChannelIE,
+    BilibiliSpaceVideoIE,
+    BilibiliSpaceAudioIE,
+    BilibiliSpacePlaylistIE,
     BiliIntlIE,
     BiliIntlSeriesIE,
     BiliLiveIE,
@ -194,6 +221,7 @@
 from .bongacams import BongaCamsIE
 from .bostonglobe import BostonGlobeIE
 from .box import BoxIE
+from .booyah import BooyahClipsIE
 from .bpb import BpbIE
 from .br import (
     BRIE,
@ -207,6 +235,7 @@
     BrightcoveNewIE,
 )
 from .businessinsider import BusinessInsiderIE
+from .bundesliga import BundesligaIE
 from .buzzfeed import BuzzFeedIE
 from .byutv import BYUtvIE
 from .c56 import C56IE
@ -306,6 +335,7 @@
     CNNIE,
     CNNBlogsIE,
     CNNArticleIE,
+    CNNIndonesiaIE,
 )
 from .coub import CoubIE
 from .comedycentral import (
@ -384,7 +414,7 @@
     DeezerAlbumIE,
 )
 from .democracynow import DemocracynowIE
-from .detik import Detik20IE
+from .detik import DetikEmbedIE
 from .dfb import DFBIE
 from .dhm import DHMIE
 from .digg import DiggIE
@ -411,6 +441,7 @@
     AnimalPlanetIE,
     TLCIE,
     MotorTrendIE,
+    MotorTrendOnDemandIE,
     DiscoveryPlusIndiaIE,
     DiscoveryNetworksDeIE,
     DiscoveryPlusItalyIE,
@ -470,6 +501,7 @@
     EpiconIE,
     EpiconSeriesIE,
 )
+from .epoch import EpochIE
 from .eporner import EpornerIE
 from .eroprofile import (
     EroProfileIE,
@ -491,6 +523,7 @@
 from .esri import EsriVideoIE
 from .europa import EuropaIE
 from .europeantour import EuropeanTourIE
+from .eurosport import EurosportIE
 from .euscreen import EUScreenIE
 from .expotv import ExpoTVIE
 from .expressen import ExpressenIE
@ -500,6 +533,7 @@
     FacebookIE,
     FacebookPluginsVideoIE,
     FacebookRedirectURLIE,
+    FacebookReelIE,
 )
 from .fancode import (
     FancodeVodIE,
@ -622,6 +656,7 @@
 )
 from .googlesearch import GoogleSearchIE
 from .gopro import GoProIE
+from .goplay import GoPlayIE
 from .goshgay import GoshgayIE
 from .gotostage import GoToStageIE
 from .gputechconf import GPUTechConfIE
@ -631,6 +666,7 @@
     GronkhVodsIE
 )
 from .groupon import GrouponIE
+from .harpodeon import HarpodeonIE
 from .hbo import HBOIE
 from .hearthisat import HearThisAtIE
 from .heise import HeiseIE
@ -662,7 +698,10 @@
     HSEShowIE,
     HSEProductIE,
 )
-from .genericembeds import HTML5MediaEmbedIE
+from .genericembeds import (
+    HTML5MediaEmbedIE,
+    QuotedHTMLIE,
+)
 from .huajiao import HuajiaoIE
 from .huya import HuyaLiveIE
 from .huffpost import HuffPostIE
@ -687,6 +726,7 @@
     IHeartRadioIE,
     IHeartRadioPodcastIE,
 )
+from .iltalehti import IltalehtiIE
 from .imdb import (
     ImdbIE,
     ImdbListIE
@ -718,6 +758,11 @@
     IqIE,
     IqAlbumIE
 )
+from .islamchannel import (
+    IslamChannelIE,
+    IslamChannelSeriesIE,
+)
+from .israelnationalnews import IsraelNationalNewsIE
 from .itprotv import (
     ITProTVIE,
     ITProTVCourseIE
@ -907,6 +952,7 @@
     MediasiteCatalogIE,
     MediasiteNamedCatalogIE,
 )
+from .mediaworksnz import MediaWorksNZVODIE
 from .medici import MediciIE
 from .megaphone import MegaphoneIE
 from .meipai import MeipaiIE
@ -922,6 +968,7 @@
     MicrosoftVirtualAcademyIE,
     MicrosoftVirtualAcademyCourseIE,
 )
+from .microsoftembed import MicrosoftEmbedIE
 from .mildom import (
     MildomIE,
     MildomVodIE,
@ -955,6 +1002,7 @@
 from .mlb import (
     MLBIE,
     MLBVideoIE,
+    MLBTVIE,
 )
 from .mlssoccer import MLSSoccerIE
 from .mnet import MnetIE
@ -973,6 +1021,7 @@
 from .motorsport import MotorsportIE
 from .movieclips import MovieClipsIE
 from .moviepilot import MoviepilotIE
+from .moview import MoviewPlayIE
 from .moviezine import MoviezineIE
 from .movingimage import MovingImageIE
 from .msn import MSNIE
@ -1041,6 +1090,7 @@
     NBCSportsIE,
     NBCSportsStreamIE,
     NBCSportsVPlayerIE,
+    NBCStationsIE,
 )
 from .ndr import (
     NDRIE,
@ -1075,6 +1125,7 @@
     NewgroundsPlaylistIE,
     NewgroundsUserIE,
 )
+from .newspicks import NewsPicksIE
 from .newstube import NewstubeIE
 from .newsy import NewsyIE
 from .nextmedia import (
@ -1134,6 +1185,7 @@
 from .noovo import NoovoIE
 from .normalboots import NormalbootsIE
 from .nosvideo import NosVideoIE
+from .nosnl import NOSNLArticleIE
 from .nova import (
     NovaEmbedIE,
     NovaIE,
@ -1188,6 +1240,7 @@
 from .on24 import On24IE
 from .ondemandkorea import OnDemandKoreaIE
 from .onefootball import OneFootballIE
+from .onenewsnz import OneNewsNZIE
 from .onet import (
     OnetIE,
     OnetChannelIE,
@ -1235,7 +1288,7 @@
     ParamountPlusIE,
     ParamountPlusSeriesIE,
 )
-from .parliamentliveuk import ParliamentLiveUKIE
+from .parler import ParlerIE
 from .parlview import ParlviewIE
 from .patreon import (
     PatreonIE,
@ -1296,6 +1349,7 @@
     PluralsightIE,
     PluralsightCourseIE,
 )
+from .podbayfm import PodbayFMIE, PodbayFMChannelIE
 from .podchaser import PodchaserIE
 from .podomatic import PodomaticIE
 from .pokemon import (
@ -1336,6 +1390,7 @@
     PuhuTVIE,
     PuhuTVSerieIE,
 )
+from .prankcast import PrankCastIE
 from .premiershiprugby import PremiershipRugbyIE
 from .presstv import PressTVIE
 from .projectveritas import ProjectVeritasIE
@ -1406,6 +1461,7 @@
     RCTIPlusTVIE,
 )
 from .rds import RDSIE
+from .redbee import ParliamentLiveUKIE, RTBFIE
 from .redbulltv import (
     RedBullTVIE,
     RedBullEmbedIE,
@ -1439,7 +1495,6 @@
 from .roosterteeth import RoosterTeethIE, RoosterTeethSeriesIE
 from .rottentomatoes import RottenTomatoesIE
 from .rozhlas import RozhlasIE
-from .rtbf import RTBFIE
 from .rte import RteIE, RteRadioIE
 from .rtlnl import (
     RtlNlIE,
@ -1516,6 +1571,7 @@
 from .sapo import SapoIE
 from .savefrom import SaveFromIE
 from .sbs import SBSIE
+from .screen9 import Screen9IE
 from .screencast import ScreencastIE
 from .screencastomatic import ScreencastOMaticIE
 from .scrippsnetworks import (
@ -1581,6 +1637,7 @@
 from .slideshare import SlideshareIE
 from .slideslive import SlidesLiveIE
 from .slutload import SlutloadIE
+from .smotrim import SmotrimIE
 from .snotr import SnotrIE
 from .sohu import SohuIE
 from .sonyliv import (
@ -1724,6 +1781,14 @@
 from .teletask import TeleTaskIE
 from .telewebion import TelewebionIE
 from .tempo import TempoIE
+from .tencent import (
+    IflixEpisodeIE,
+    IflixSeriesIE,
+    VQQSeriesIE,
+    VQQVideoIE,
+    WeTvEpisodeIE,
+    WeTvSeriesIE,
+)
 from .tennistv import TennisTVIE
 from .tenplay import TenPlayIE
 from .testurl import TestURLIE
@ -1783,6 +1848,10 @@
 from .toutv import TouTvIE
 from .toypics import ToypicsUserIE, ToypicsIE
 from .traileraddict import TrailerAddictIE
+from .triller import (
+    TrillerIE,
+    TrillerUserIE,
+)
 from .trilulilu import TriluliluIE
 from .trovo import (
     TrovoIE,
@ -1792,6 +1861,7 @@
 )
 from .trueid import TrueIDIE
 from .trunews import TruNewsIE
+from .truth import TruthIE
 from .trutv import TruTVIE
 from .tube8 import Tube8IE
 from .tubetugraz import TubeTuGrazIE, TubeTuGrazSeriesIE
@ -1815,6 +1885,9 @@
     KatsomoIE,
     MTVUutisetArticleIE,
 )
+from .tv24ua import (
+    TV24UAVideoIE,
+)
 from .tv2dk import (
     TV2DKIE,
     TV2DKBornholmPlayIE,
@ -1917,6 +1990,7 @@
 from .umg import UMGDeIE
 from .unistra import UnistraIE
 from .unity import UnityIE
+from .unscripted import UnscriptedNewsVideoIE
 from .uol import UOLIE
 from .uplynk import (
     UplynkIE,
@ -1974,7 +2048,6 @@
     VidioLiveIE
 )
 from .vidlii import VidLiiIE
-from .vier import VierIE, VierVideosIE
 from .viewlift import (
     ViewLiftIE,
     ViewLiftEmbedIE,
@ -2087,7 +2160,6 @@
     WeiboMobileIE
 )
 from .weiqitv import WeiqiTVIE
-from .wetv import WeTvEpisodeIE, WeTvSeriesIE
 from .wikimedia import WikimediaIE
 from .willow import WillowIE
 from .wimtv import WimTVIE
@ -2095,6 +2167,11 @@
 from .wistia import (
     WistiaIE,
     WistiaPlaylistIE,
+    WistiaChannelIE,
+)
+from .wordpress import (
+    WordpressPlaylistEmbedIE,
+    WordpressMiniAudioPlayerEmbedIE,
 )
 from .worldstarhiphop import WorldStarHipHopIE
 from .wppilot import (
@ -2170,42 +2247,44 @@
 from .youporn import YouPornIE
 from .yourporn import YourPornIE
 from .yourupload import YourUploadIE
-from .youtube import (
-    YoutubeIE,
-    YoutubeClipIE,
-    YoutubeFavouritesIE,
-    YoutubeNotificationsIE,
-    YoutubeHistoryIE,
-    YoutubeTabIE,
-    YoutubeLivestreamEmbedIE,
-    YoutubePlaylistIE,
-    YoutubeRecommendedIE,
-    YoutubeSearchDateIE,
-    YoutubeSearchIE,
-    YoutubeSearchURLIE,
-    YoutubeMusicSearchURLIE,
-    YoutubeSubscriptionsIE,
-    YoutubeStoriesIE,
-    YoutubeTruncatedIDIE,
-    YoutubeTruncatedURLIE,
-    YoutubeYtBeIE,
-    YoutubeYtUserIE,
-    YoutubeWatchLaterIE,
-)
 from .zapiks import ZapiksIE
 from .zattoo import (
     BBVTVIE,
+    BBVTVLiveIE,
+    BBVTVRecordingsIE,
     EinsUndEinsTVIE,
+    EinsUndEinsTVLiveIE,
+    EinsUndEinsTVRecordingsIE,
     EWETVIE,
+    EWETVLiveIE,
+    EWETVRecordingsIE,
     GlattvisionTVIE,
+    GlattvisionTVLiveIE,
+    GlattvisionTVRecordingsIE,
     MNetTVIE,
-    NetPlusIE,
+    MNetTVLiveIE,
+    MNetTVRecordingsIE,
+    NetPlusTVIE,
+    NetPlusTVLiveIE,
+    NetPlusTVRecordingsIE,
     OsnatelTVIE,
+    OsnatelTVLiveIE,
+    OsnatelTVRecordingsIE,
     QuantumTVIE,
+    QuantumTVLiveIE,
+    QuantumTVRecordingsIE,
     SaltTVIE,
+    SaltTVLiveIE,
+    SaltTVRecordingsIE,
     SAKTVIE,
+    SAKTVLiveIE,
+    SAKTVRecordingsIE,
     VTXTVIE,
+    VTXTVLiveIE,
+    VTXTVRecordingsIE,
     WalyTVIE,
+    WalyTVLiveIE,
+    WalyTVRecordingsIE,
     ZattooIE,
     ZattooLiveIE,
     ZattooMoviesIE,

View File

@ -365,7 +365,7 @@ def _real_extract(self, url):
         # read breadcrumb on top of page
         breadcrumb = self._extract_breadcrumb_list(webpage, video_id)
         if breadcrumb:
-            # breadcrumb list translates to: (example is 1st test for this IE)
+            # breadcrumb list translates to: (e.g. 1st test for this IE)
             # Home > Anime (genre) > Isekai Shokudo 2 (series name) > Episode 1 "Cheese cakes" "Morning again" (episode title)
             # hence this works
             info['series'] = breadcrumb[-2]

View File

@ -84,7 +84,7 @@ def _real_extract(self, url):
         video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)
-        json_all = self._search_json(r'window.videoInfo\s*=\s*', webpage, 'videoInfo', video_id)
+        json_all = self._search_json(r'window.videoInfo\s*=', webpage, 'videoInfo', video_id)
         title = json_all.get('title')
         video_list = json_all.get('videoList') or []
@ -164,7 +164,7 @@ def _real_extract(self, url):
         video_id = f'{video_id}{format_field(ac_idx, template="__%s")}'
         webpage = self._download_webpage(url, video_id)
-        json_bangumi_data = self._search_json(r'window.bangumiData\s*=\s*', webpage, 'bangumiData', video_id)
+        json_bangumi_data = self._search_json(r'window.bangumiData\s*=', webpage, 'bangumiData', video_id)
         if ac_idx:
             video_info = json_bangumi_data['hlVideoInfo']
@ -181,7 +181,7 @@ def _real_extract(self, url):
             if v.get('id') == season_id), 1)
         json_bangumi_list = self._search_json(
-            r'window\.bangumiList\s*=\s*', webpage, 'bangumiList', video_id, fatal=False)
+            r'window\.bangumiList\s*=', webpage, 'bangumiList', video_id, fatal=False)
         video_internal_id = int_or_none(traverse_obj(json_bangumi_data, ('currentVideoInfo', 'id')))
         episode_number = video_internal_id and next((
             idx for idx, v in enumerate(json_bangumi_list.get('items') or [], 1)
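
The three hunks above drop the trailing `\s*` from each start pattern, presumably because the JSON search helper already tolerates whitespace before the value. A rough sketch of that idea using only the standard library follows; `search_json_after` and the sample page are hypothetical and are not yt-dlp's implementation:

```python
import json
import re


def search_json_after(start_pattern, text):
    """Hypothetical helper: decode the JSON value that follows start_pattern."""
    # The helper itself consumes any whitespace after the start pattern
    match = re.search(rf'(?:{start_pattern})\s*', text)
    if not match:
        return None
    # raw_decode parses one JSON value and ignores whatever follows it
    value, _end = json.JSONDecoder().raw_decode(text, match.end())
    return value


page = 'window.videoInfo =   {"title": "demo", "videoList": []};'
info = search_json_after(r'window\.videoInfo\s*=', page)
```

Because the helper owns the whitespace handling, callers only need to describe the text up to and including the `=` sign.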

View File

@ -1344,6 +1344,11 @@
         'username_field': 'username',
         'password_field': 'password',
     },
+    'AlticeOne': {
+        'name': 'Optimum TV',
+        'username_field': 'j_username',
+        'password_field': 'j_password',
+    },
 }
@ -1705,7 +1710,7 @@ def extract_redirect_url(html, url=None, fatal=False):
                         mso_info.get('username_field', 'username'): username,
                         mso_info.get('password_field', 'password'): password
                     }
-                    if mso_id == 'Cablevision':
+                    if mso_id in ('Cablevision', 'AlticeOne'):
                         form_data['_eventId_proceed'] = ''
                     mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', form_data)
                     if mso_id != 'Rogers':

View File

@ -28,14 +28,17 @@ class AENetworksBaseIE(ThePlatformIE):
     }
     def _extract_aen_smil(self, smil_url, video_id, auth=None):
-        query = {'mbr': 'true'}
+        query = {
+            'mbr': 'true',
+            'formats': 'M3U+none,MPEG-DASH+none,MPEG4,MP3',
+        }
         if auth:
             query['auth'] = auth
         TP_SMIL_QUERY = [{
             'assetTypes': 'high_video_ak',
-            'switch': 'hls_high_ak'
+            'switch': 'hls_high_ak',
         }, {
-            'assetTypes': 'high_video_s3'
+            'assetTypes': 'high_video_s3',
         }, {
             'assetTypes': 'high_video_s3',
             'switch': 'hls_high_fastly',

View File

@ -0,0 +1,40 @@
+from .common import InfoExtractor
+from .vimeo import VimeoIE
+class AeonCoIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?aeon\.co/videos/(?P<id>[^/?]+)'
+    _TESTS = [{
+        'url': 'https://aeon.co/videos/raw-solar-storm-footage-is-the-punk-rock-antidote-to-sleek-james-webb-imagery',
+        'md5': 'e5884d80552c9b6ea8d268a258753362',
+        'info_dict': {
+            'id': '1284717',
+            'ext': 'mp4',
+            'title': 'Brilliant Noise',
+            'thumbnail': 'https://i.vimeocdn.com/video/21006315-1a1e49da8b07fd908384a982b4ba9ff0268c509a474576ebdf7b1392f4acae3b-d_960',
+            'uploader': 'Semiconductor',
+            'uploader_id': 'semiconductor',
+            'uploader_url': 'https://vimeo.com/semiconductor',
+            'duration': 348
+        }
+    }, {
+        'url': 'https://aeon.co/videos/dazzling-timelapse-shows-how-microbes-spoil-our-food-and-sometimes-enrich-it',
+        'md5': '4e5f3dad9dbda0dbfa2da41a851e631e',
+        'info_dict': {
+            'id': '728595228',
+            'ext': 'mp4',
+            'title': 'Wrought',
+            'thumbnail': 'https://i.vimeocdn.com/video/1484618528-c91452611f9a4e4497735a533da60d45b2fe472deb0c880f0afaab0cd2efb22a-d_1280',
+            'uploader': 'Biofilm Productions',
+            'uploader_id': 'user140352216',
+            'uploader_url': 'https://vimeo.com/user140352216',
+            'duration': 1344
+        }
+    }]
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+        vimeo_id = self._search_regex(r'hosterId":\s*"(?P<id>[0-9]+)', webpage, 'vimeo id')
+        vimeo_url = VimeoIE._smuggle_referrer(f'https://player.vimeo.com/video/{vimeo_id}', 'https://aeon.co')
+        return self.url_result(vimeo_url, VimeoIE)
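
The extraction step in `_real_extract` comes down to one regular expression followed by a URL template. A standalone sketch with plain `re` follows; the sample page string and the `find_vimeo_id` helper are made up for illustration, and the real aeon.co markup may differ:

```python
import re

# Hypothetical page fragment; the real aeon.co markup may differ
SAMPLE_WEBPAGE = '{"hoster":"vimeo","hosterId": "728595228","title":"Wrought"}'


def find_vimeo_id(webpage):
    """Return the numeric id following `hosterId":`, or None if it is absent."""
    match = re.search(r'hosterId":\s*"(?P<id>[0-9]+)', webpage)
    return match.group('id') if match else None


vimeo_id = find_vimeo_id(SAMPLE_WEBPAGE)
# Same player URL template as in the extractor
player_url = f'https://player.vimeo.com/video/{vimeo_id}'
```

The extractor then hands this URL to the Vimeo extractor with `https://aeon.co` as the referrer, which the sketch leaves out.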

View File

@ -1,5 +1,5 @@
 from .common import InfoExtractor
-from ..utils import int_or_none
+from ..utils import ExtractorError, int_or_none
 class AmazonStoreIE(InfoExtractor):
@ -9,7 +9,7 @@ class AmazonStoreIE(InfoExtractor):
         'url': 'https://www.amazon.co.uk/dp/B098XNCHLD/',
         'info_dict': {
             'id': 'B098XNCHLD',
-            'title': 'md5:5f3194dbf75a8dcfc83079bd63a2abed',
+            'title': 'md5:dae240564cbb2642170c02f7f0d7e472',
         },
         'playlist_mincount': 1,
         'playlist': [{
@ -18,28 +18,44 @@ class AmazonStoreIE(InfoExtractor):
                 'ext': 'mp4',
                 'title': 'mcdodo usb c cable 100W 5a',
                 'thumbnail': r're:^https?://.*\.jpg$',
+                'duration': 34,
             },
         }]
     }, {
         'url': 'https://www.amazon.in/Sony-WH-1000XM4-Cancelling-Headphones-Bluetooth/dp/B0863TXGM3',
         'info_dict': {
             'id': 'B0863TXGM3',
-            'title': 'md5:b0bde4881d3cfd40d63af19f7898b8ff',
+            'title': 'md5:d1d3352428f8f015706c84b31e132169',
         },
         'playlist_mincount': 4,
     }, {
         'url': 'https://www.amazon.com/dp/B0845NXCXF/',
         'info_dict': {
             'id': 'B0845NXCXF',
-            'title': 'md5:2145cd4e3c7782f1ee73649a3cff1171',
+            'title': 'md5:f3fa12779bf62ddb6a6ec86a360a858e',
         },
         'playlist-mincount': 1,
+    }, {
+        'url': 'https://www.amazon.es/Samsung-Smartphone-s-AMOLED-Quad-c%C3%A1mara-espa%C3%B1ola/dp/B08WX337PQ',
+        'info_dict': {
+            'id': 'B08WX337PQ',
+            'title': 'md5:f3fa12779bf62ddb6a6ec86a360a858e',
+        },
+        'playlist_mincount': 1,
     }]
     def _real_extract(self, url):
         id = self._match_id(url)
-        webpage = self._download_webpage(url, id)
-        data_json = self._parse_json(self._html_search_regex(r'var\s?obj\s?=\s?jQuery\.parseJSON\(\'(.*)\'\)', webpage, 'data'), id)
+        for retry in self.RetryManager():
+            webpage = self._download_webpage(url, id)
+            try:
+                data_json = self._search_json(
+                    r'var\s?obj\s?=\s?jQuery\.parseJSON\(\'', webpage, 'data', id,
+                    transform_source=lambda x: x.replace(R'\\u', R'\u'))
+            except ExtractorError as e:
+                retry.error = e
         entries = [{
             'id': video['marketPlaceID'],
             'url': video['url'],
@ -49,4 +65,4 @@ def _real_extract(self, url):
             'height': int_or_none(video.get('videoHeight')),
             'width': int_or_none(video.get('videoWidth')),
         } for video in (data_json.get('videos') or []) if video.get('isVideo') and video.get('url')]
-        return self.playlist_result(entries, playlist_id=id, playlist_title=data_json['title'])
+        return self.playlist_result(entries, playlist_id=id, playlist_title=data_json.get('title'))

View File

@ -1,282 +0,0 @@
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
determine_ext,
extract_attributes,
ExtractorError,
join_nonempty,
url_or_none,
urlencode_postdata,
urljoin,
)
class AnimeOnDemandIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?anime-on-demand\.de/anime/(?P<id>\d+)'
_LOGIN_URL = 'https://www.anime-on-demand.de/users/sign_in'
_APPLY_HTML5_URL = 'https://www.anime-on-demand.de/html5apply'
_NETRC_MACHINE = 'animeondemand'
# German-speaking countries of Europe
_GEO_COUNTRIES = ['AT', 'CH', 'DE', 'LI', 'LU']
_TESTS = [{
# jap, OmU
'url': 'https://www.anime-on-demand.de/anime/161',
'info_dict': {
'id': '161',
'title': 'Grimgar, Ashes and Illusions (OmU)',
'description': 'md5:6681ce3c07c7189d255ac6ab23812d31',
},
'playlist_mincount': 4,
}, {
# Film wording is used instead of Episode, ger/jap, Dub/OmU
'url': 'https://www.anime-on-demand.de/anime/39',
'only_matching': True,
}, {
# Episodes without titles, jap, OmU
'url': 'https://www.anime-on-demand.de/anime/162',
'only_matching': True,
}, {
# ger/jap, Dub/OmU, account required
'url': 'https://www.anime-on-demand.de/anime/169',
'only_matching': True,
}, {
# Full length film, non-series, ger/jap, Dub/OmU, account required
'url': 'https://www.anime-on-demand.de/anime/185',
'only_matching': True,
}, {
# Flash videos
'url': 'https://www.anime-on-demand.de/anime/12',
'only_matching': True,
}]
def _perform_login(self, username, password):
login_page = self._download_webpage(
self._LOGIN_URL, None, 'Downloading login page')
if '>Our licensing terms allow the distribution of animes only to German-speaking countries of Europe' in login_page:
self.raise_geo_restricted(
'%s is only available in German-speaking countries of Europe' % self.IE_NAME)
login_form = self._form_hidden_inputs('new_user', login_page)
login_form.update({
'user[login]': username,
'user[password]': password,
})
post_url = self._search_regex(
r'<form[^>]+action=(["\'])(?P<url>.+?)\1', login_page,
'post url', default=self._LOGIN_URL, group='url')
if not post_url.startswith('http'):
post_url = urljoin(self._LOGIN_URL, post_url)
response = self._download_webpage(
post_url, None, 'Logging in',
data=urlencode_postdata(login_form), headers={
'Referer': self._LOGIN_URL,
})
if all(p not in response for p in ('>Logout<', 'href="/users/sign_out"')):
error = self._search_regex(
r'<p[^>]+\bclass=(["\'])(?:(?!\1).)*\balert\b(?:(?!\1).)*\1[^>]*>(?P<error>.+?)</p>',
response, 'error', default=None, group='error')
if error:
raise ExtractorError('Unable to login: %s' % error, expected=True)
raise ExtractorError('Unable to log in')
def _real_extract(self, url):
anime_id = self._match_id(url)
webpage = self._download_webpage(url, anime_id)
if 'data-playlist=' not in webpage:
self._download_webpage(
self._APPLY_HTML5_URL, anime_id,
'Activating HTML5 beta', 'Unable to apply HTML5 beta')
webpage = self._download_webpage(url, anime_id)
csrf_token = self._html_search_meta(
'csrf-token', webpage, 'csrf token', fatal=True)
anime_title = self._html_search_regex(
r'(?s)<h1[^>]+itemprop="name"[^>]*>(.+?)</h1>',
webpage, 'anime name')
anime_description = self._html_search_regex(
r'(?s)<div[^>]+itemprop="description"[^>]*>(.+?)</div>',
webpage, 'anime description', default=None)
def extract_info(html, video_id, num=None):
title, description = [None] * 2
formats = []
for input_ in re.findall(
r'<input[^>]+class=["\'].*?streamstarter[^>]+>', html):
attributes = extract_attributes(input_)
title = attributes.get('data-dialog-header')
playlist_urls = []
for playlist_key in ('data-playlist', 'data-otherplaylist', 'data-stream'):
playlist_url = attributes.get(playlist_key)
if isinstance(playlist_url, compat_str) and re.match(
r'/?[\da-zA-Z]+', playlist_url):
playlist_urls.append(attributes[playlist_key])
if not playlist_urls:
continue
lang = attributes.get('data-lang')
lang_note = attributes.get('value')
for playlist_url in playlist_urls:
kind = self._search_regex(
r'videomaterialurl/\d+/([^/]+)/',
playlist_url, 'media kind', default=None)
format_id = join_nonempty(lang, kind) if lang or kind else str(num)
format_note = join_nonempty(kind, lang_note, delim=', ')
item_id_list = []
if format_id:
item_id_list.append(format_id)
item_id_list.append('videomaterial')
playlist = self._download_json(
urljoin(url, playlist_url), video_id,
'Downloading %s JSON' % ' '.join(item_id_list),
headers={
'X-Requested-With': 'XMLHttpRequest',
'X-CSRF-Token': csrf_token,
'Referer': url,
'Accept': 'application/json, text/javascript, */*; q=0.01',
}, fatal=False)
if not playlist:
continue
stream_url = url_or_none(playlist.get('streamurl'))
if stream_url:
rtmp = re.search(
r'^(?P<url>rtmpe?://(?P<host>[^/]+)/(?P<app>.+/))(?P<playpath>mp[34]:.+)',
stream_url)
if rtmp:
formats.append({
'url': rtmp.group('url'),
'app': rtmp.group('app'),
'play_path': rtmp.group('playpath'),
'page_url': url,
'player_url': 'https://www.anime-on-demand.de/assets/jwplayer.flash-55abfb34080700304d49125ce9ffb4a6.swf',
'rtmp_real_time': True,
'format_id': 'rtmp',
'ext': 'flv',
})
continue
start_video = playlist.get('startvideo', 0)
playlist = playlist.get('playlist')
if not playlist or not isinstance(playlist, list):
continue
playlist = playlist[start_video]
title = playlist.get('title')
if not title:
continue
description = playlist.get('description')
for source in playlist.get('sources', []):
file_ = source.get('file')
if not file_:
continue
ext = determine_ext(file_)
format_id = join_nonempty(
lang, kind,
'hls' if ext == 'm3u8' else None,
'dash' if source.get('type') == 'video/dash' or ext == 'mpd' else None)
if ext == 'm3u8':
file_formats = self._extract_m3u8_formats(
file_, video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id=format_id, fatal=False)
elif source.get('type') == 'video/dash' or ext == 'mpd':
continue
file_formats = self._extract_mpd_formats(
file_, video_id, mpd_id=format_id, fatal=False)
else:
continue
for f in file_formats:
f.update({
'language': lang,
'format_note': format_note,
})
formats.extend(file_formats)
return {
'title': title,
'description': description,
'formats': formats,
}
def extract_entries(html, video_id, common_info, num=None):
info = extract_info(html, video_id, num)
if info['formats']:
self._sort_formats(info['formats'])
f = common_info.copy()
f.update(info)
yield f
# Extract teaser/trailer only when full episode is not available
if not info['formats']:
m = re.search(
r'data-dialog-header=(["\'])(?P<title>.+?)\1[^>]+href=(["\'])(?P<href>.+?)\3[^>]*>(?P<kind>Teaser|Trailer)<',
html)
if m:
f = common_info.copy()
f.update({
'id': '%s-%s' % (f['id'], m.group('kind').lower()),
'title': m.group('title'),
'url': urljoin(url, m.group('href')),
})
yield f
def extract_episodes(html):
for num, episode_html in enumerate(re.findall(
r'(?s)<h3[^>]+class="episodebox-title".+?>Episodeninhalt<', html), 1):
episodebox_title = self._search_regex(
(r'class="episodebox-title"[^>]+title=(["\'])(?P<title>.+?)\1',
r'class="episodebox-title"[^>]+>(?P<title>.+?)<'),
episode_html, 'episodebox title', default=None, group='title')
if not episodebox_title:
continue
episode_number = int(self._search_regex(
r'(?:Episode|Film)\s*(\d+)',
episodebox_title, 'episode number', default=num))
episode_title = self._search_regex(
r'(?:Episode|Film)\s*\d+\s*-\s*(.+)',
episodebox_title, 'episode title', default=None)
video_id = 'episode-%d' % episode_number
common_info = {
'id': video_id,
'series': anime_title,
'episode': episode_title,
'episode_number': episode_number,
}
for e in extract_entries(episode_html, video_id, common_info):
yield e
def extract_film(html, video_id):
common_info = {
'id': anime_id,
'title': anime_title,
'description': anime_description,
}
for e in extract_entries(html, video_id, common_info):
yield e
def entries():
has_episodes = False
for e in extract_episodes(webpage):
has_episodes = True
yield e
if not has_episodes:
for e in extract_film(webpage, anime_id):
yield e
return self.playlist_result(
entries(), anime_id, anime_title, anime_description)


@ -5,31 +5,70 @@
import re import re
import time import time
from .anvato_token_generator import NFLTokenGenerator
from .common import InfoExtractor from .common import InfoExtractor
from ..aes import aes_encrypt from ..aes import aes_encrypt
from ..compat import compat_str
from ..utils import ( from ..utils import (
bytes_to_intlist, bytes_to_intlist,
determine_ext, determine_ext,
intlist_to_bytes,
int_or_none, int_or_none,
intlist_to_bytes,
join_nonempty, join_nonempty,
smuggle_url,
strip_jsonp, strip_jsonp,
traverse_obj,
unescapeHTML, unescapeHTML,
unsmuggle_url, unsmuggle_url,
) )
def md5_text(s): def md5_text(s):
if not isinstance(s, compat_str): return hashlib.md5(str(s).encode()).hexdigest()
s = compat_str(s)
return hashlib.md5(s.encode('utf-8')).hexdigest()
class AnvatoIE(InfoExtractor): class AnvatoIE(InfoExtractor):
_VALID_URL = r'anvato:(?P<access_key_or_mcp>[^:]+):(?P<id>\d+)' _VALID_URL = r'anvato:(?P<access_key_or_mcp>[^:]+):(?P<id>\d+)'
_API_BASE_URL = 'https://tkx.mp.lura.live/rest/v2'
_ANVP_RE = r'<script[^>]+\bdata-anvp\s*=\s*(["\'])(?P<anvp>(?:(?!\1).)+)\1'
_AUTH_KEY = b'\x31\xc2\x42\x84\x9e\x73\xa0\xce' # from anvplayer.min.js
_TESTS = [{
# from https://www.nfl.com/videos/baker-mayfield-s-game-changing-plays-from-3-td-game-week-14
'url': 'anvato:GXvEgwyJeWem8KCYXfeoHWknwP48Mboj:899441',
'md5': '921919dab3cd0b849ff3d624831ae3e2',
'info_dict': {
'id': '899441',
'ext': 'mp4',
'title': 'Baker Mayfield\'s game-changing plays from 3-TD game Week 14',
'description': 'md5:85e05a3cc163f8c344340f220521136d',
'upload_date': '20201215',
'timestamp': 1608009755,
'thumbnail': r're:^https?://.*\.jpg',
'uploader': 'NFL',
'tags': ['Baltimore Ravens at Cleveland Browns (2020-REG-14)', 'Baker Mayfield', 'Game Highlights',
'Player Highlights', 'Cleveland Browns', 'league'],
'duration': 157,
'categories': ['Entertainment', 'Game', 'Highlights'],
},
}, {
# from https://ktla.com/news/99-year-old-woman-learns-to-fly-in-torrance-checks-off-bucket-list-dream/
'url': 'anvato:X8POa4zpGZMmeiq0wqiO8IP5rMqQM9VN:8032455',
'md5': '837718bcfb3a7778d022f857f7a9b19e',
'info_dict': {
'id': '8032455',
'ext': 'mp4',
'title': '99-year-old woman learns to fly plane in Torrance, checks off bucket list dream',
'description': 'md5:0a12bab8159445e78f52a297a35c6609',
'upload_date': '20220928',
'timestamp': 1664408881,
'thumbnail': r're:^https?://.*\.jpg',
'uploader': 'LIN',
'tags': ['video', 'news', '5live'],
'duration': 155,
'categories': ['News'],
},
}]
# Copied from anvplayer.min.js # Copied from anvplayer.min.js
_ANVACK_TABLE = { _ANVACK_TABLE = {
'nbcu_nbcd_desktop_web_prod_93d8ead38ce2024f8f544b78306fbd15895ae5e6': 'NNemUkySjxLyPTKvZRiGntBIjEyK8uqicjMakIaQ', 'nbcu_nbcd_desktop_web_prod_93d8ead38ce2024f8f544b78306fbd15895ae5e6': 'NNemUkySjxLyPTKvZRiGntBIjEyK8uqicjMakIaQ',
@ -202,86 +241,74 @@ class AnvatoIE(InfoExtractor):
'telemundo': 'anvato_mcp_telemundo_web_prod_c5278d51ad46fda4b6ca3d0ea44a7846a054f582' 'telemundo': 'anvato_mcp_telemundo_web_prod_c5278d51ad46fda4b6ca3d0ea44a7846a054f582'
} }
def _generate_nfl_token(self, anvack, mcp_id):
reroute = self._download_json(
'https://api.nfl.com/v1/reroute', mcp_id, data=b'grant_type=client_credentials',
headers={'X-Domain-Id': 100}, note='Fetching token info')
token_type = reroute.get('token_type') or 'Bearer'
auth_token = f'{token_type} {reroute["access_token"]}'
response = self._download_json(
'https://api.nfl.com/v3/shield/', mcp_id, data=json.dumps({
'query': '''{
viewer {
mediaToken(anvack: "%s", id: %s) {
token
}
}
}''' % (anvack, mcp_id),
}).encode(), headers={
'Authorization': auth_token,
'Content-Type': 'application/json',
}, note='Fetching NFL API token')
return traverse_obj(response, ('data', 'viewer', 'mediaToken', 'token'))
_TOKEN_GENERATORS = { _TOKEN_GENERATORS = {
'GXvEgwyJeWem8KCYXfeoHWknwP48Mboj': NFLTokenGenerator, 'GXvEgwyJeWem8KCYXfeoHWknwP48Mboj': _generate_nfl_token,
} }
_API_KEY = '3hwbSuqqT690uxjNYBktSQpa5ZrpYYR0Iofx7NcJHyA'
_ANVP_RE = r'<script[^>]+\bdata-anvp\s*=\s*(["\'])(?P<anvp>(?:(?!\1).)+)\1'
_AUTH_KEY = b'\x31\xc2\x42\x84\x9e\x73\xa0\xce'
_TESTS = [{
# from https://www.boston25news.com/news/watch-humpback-whale-breaches-right-next-to-fishing-boat-near-nh/817484874
'url': 'anvato:8v9BEynrwx8EFLYpgfOWcG1qJqyXKlRM:4465496',
'info_dict': {
'id': '4465496',
'ext': 'mp4',
'title': 'VIDEO: Humpback whale breaches right next to NH boat',
'description': 'VIDEO: Humpback whale breaches right next to NH boat. Footage courtesy: Zach Fahey.',
'duration': 22,
'timestamp': 1534855680,
'upload_date': '20180821',
'uploader': 'ANV',
},
'params': {
'skip_download': True,
},
}, {
# from https://sanfrancisco.cbslocal.com/2016/06/17/source-oakland-cop-on-leave-for-having-girlfriend-help-with-police-reports/
'url': 'anvato:DVzl9QRzox3ZZsP9bNu5Li3X7obQOnqP:3417601',
'only_matching': True,
}]
def __init__(self, *args, **kwargs):
super(AnvatoIE, self).__init__(*args, **kwargs)
self.__server_time = None
def _server_time(self, access_key, video_id): def _server_time(self, access_key, video_id):
if self.__server_time is not None: return int_or_none(traverse_obj(self._download_json(
return self.__server_time f'{self._API_BASE_URL}/server_time', video_id, query={'anvack': access_key},
note='Fetching server time', fatal=False), 'server_time')) or int(time.time())
self.__server_time = int(self._download_json( def _get_video_json(self, access_key, video_id, extracted_token):
self._api_prefix(access_key) + 'server_time?anvack=' + access_key, video_id,
note='Fetching server time')['server_time'])
return self.__server_time
def _api_prefix(self, access_key):
return 'https://tkx2-%s.anvato.net/rest/v2/' % ('prod' if 'prod' in access_key else 'stage')
def _get_video_json(self, access_key, video_id):
# See et() in anvplayer.min.js, which is an alias of getVideoJSON() # See et() in anvplayer.min.js, which is an alias of getVideoJSON()
video_data_url = self._api_prefix(access_key) + 'mcp/video/%s?anvack=%s' % (video_id, access_key) video_data_url = f'{self._API_BASE_URL}/mcp/video/{video_id}?anvack={access_key}'
server_time = self._server_time(access_key, video_id) server_time = self._server_time(access_key, video_id)
input_data = '%d~%s~%s' % (server_time, md5_text(video_data_url), md5_text(server_time)) input_data = f'{server_time}~{md5_text(video_data_url)}~{md5_text(server_time)}'
auth_secret = intlist_to_bytes(aes_encrypt( auth_secret = intlist_to_bytes(aes_encrypt(
bytes_to_intlist(input_data[:64]), bytes_to_intlist(self._AUTH_KEY))) bytes_to_intlist(input_data[:64]), bytes_to_intlist(self._AUTH_KEY)))
query = {
video_data_url += '&X-Anvato-Adst-Auth=' + base64.b64encode(auth_secret).decode('ascii') 'X-Anvato-Adst-Auth': base64.b64encode(auth_secret).decode('ascii'),
'rtyp': 'fp',
}
anvrid = md5_text(time.time() * 1000 * random.random())[:30] anvrid = md5_text(time.time() * 1000 * random.random())[:30]
api = { api = {
'anvrid': anvrid, 'anvrid': anvrid,
'anvts': server_time, 'anvts': server_time,
} }
if self._TOKEN_GENERATORS.get(access_key) is not None: if extracted_token is not None:
api['anvstk2'] = self._TOKEN_GENERATORS[access_key].generate(self, access_key, video_id) api['anvstk2'] = extracted_token
elif self._TOKEN_GENERATORS.get(access_key) is not None:
api['anvstk2'] = self._TOKEN_GENERATORS[access_key](self, access_key, video_id)
elif self._ANVACK_TABLE.get(access_key) is not None:
api['anvstk'] = md5_text(f'{access_key}|{anvrid}|{server_time}|{self._ANVACK_TABLE[access_key]}')
else: else:
api['anvstk'] = md5_text('%s|%s|%d|%s' % ( api['anvstk2'] = 'default'
access_key, anvrid, server_time,
self._ANVACK_TABLE.get(access_key, self._API_KEY)))
return self._download_json( return self._download_json(
video_data_url, video_id, transform_source=strip_jsonp, video_data_url, video_id, transform_source=strip_jsonp, query=query,
data=json.dumps({'api': api}).encode('utf-8')) data=json.dumps({'api': api}, separators=(',', ':')).encode('utf-8'))
def _get_anvato_videos(self, access_key, video_id): def _get_anvato_videos(self, access_key, video_id, token):
video_data = self._get_video_json(access_key, video_id) video_data = self._get_video_json(access_key, video_id, token)
formats = [] formats = []
for published_url in video_data['published_urls']: for published_url in video_data['published_urls']:
video_url = published_url['embed_url'] video_url = published_url.get('embed_url')
if not video_url:
continue
media_format = published_url.get('format') media_format = published_url.get('format')
ext = determine_ext(video_url) ext = determine_ext(video_url)
@ -296,15 +323,27 @@ def _get_anvato_videos(self, access_key, video_id):
'tbr': tbr or None, 'tbr': tbr or None,
} }
if media_format == 'm3u8' and tbr is not None: vtt_subs, hls_subs = {}, {}
if media_format == 'vtt':
_, vtt_subs = self._extract_m3u8_formats_and_subtitles(
video_url, video_id, m3u8_id='vtt', fatal=False)
continue
elif media_format == 'm3u8' and tbr is not None:
a_format.update({ a_format.update({
'format_id': join_nonempty('hls', tbr), 'format_id': join_nonempty('hls', tbr),
'ext': 'mp4', 'ext': 'mp4',
}) })
elif media_format == 'm3u8-variant' or ext == 'm3u8': elif media_format == 'm3u8-variant' or ext == 'm3u8':
formats.extend(self._extract_m3u8_formats( # For some videos the initial m3u8 URL returns JSON instead
video_url, video_id, 'mp4', entry_protocol='m3u8_native', manifest_json = self._download_json(
m3u8_id='hls', fatal=False)) video_url, video_id, note='Downloading manifest JSON', errnote=False)
if manifest_json:
video_url = manifest_json.get('master_m3u8')
if not video_url:
continue
hls_fmts, hls_subs = self._extract_m3u8_formats_and_subtitles(
video_url, video_id, ext='mp4', m3u8_id='hls', fatal=False)
formats.extend(hls_fmts)
continue continue
elif ext == 'mp3' or media_format == 'mp3': elif ext == 'mp3' or media_format == 'mp3':
a_format['vcodec'] = 'none' a_format['vcodec'] = 'none'
@ -324,6 +363,7 @@ def _get_anvato_videos(self, access_key, video_id):
'ext': 'tt' if caption.get('format') == 'SMPTE-TT' else None 'ext': 'tt' if caption.get('format') == 'SMPTE-TT' else None
} }
subtitles.setdefault(caption['language'], []).append(a_caption) subtitles.setdefault(caption['language'], []).append(a_caption)
subtitles = self._merge_subtitles(subtitles, hls_subs, vtt_subs)
return { return {
'id': video_id, 'id': video_id,
@ -349,7 +389,10 @@ def _extract_from_webpage(cls, url, webpage):
access_key = cls._MCP_TO_ACCESS_KEY_TABLE.get((anvplayer_data.get('mcp') or '').lower()) access_key = cls._MCP_TO_ACCESS_KEY_TABLE.get((anvplayer_data.get('mcp') or '').lower())
if not (video_id or '').isdigit() or not access_key: if not (video_id or '').isdigit() or not access_key:
continue continue
yield cls.url_result(f'anvato:{access_key}:{video_id}', AnvatoIE, video_id) url = f'anvato:{access_key}:{video_id}'
if anvplayer_data.get('token'):
url = smuggle_url(url, {'token': anvplayer_data['token']})
yield cls.url_result(url, AnvatoIE, video_id)
def _extract_anvato_videos(self, webpage, video_id): def _extract_anvato_videos(self, webpage, video_id):
anvplayer_data = self._parse_json( anvplayer_data = self._parse_json(
@ -357,7 +400,7 @@ def _extract_anvato_videos(self, webpage, video_id):
self._ANVP_RE, webpage, 'Anvato player data', group='anvp'), self._ANVP_RE, webpage, 'Anvato player data', group='anvp'),
video_id) video_id)
return self._get_anvato_videos( return self._get_anvato_videos(
anvplayer_data['accessKey'], anvplayer_data['video']) anvplayer_data['accessKey'], anvplayer_data['video'], 'default') # cbslocal token = 'default'
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {}) url, smuggled_data = unsmuggle_url(url, {})
@ -365,9 +408,7 @@ def _real_extract(self, url):
'countries': smuggled_data.get('geo_countries'), 'countries': smuggled_data.get('geo_countries'),
}) })
mobj = self._match_valid_url(url) access_key, video_id = self._match_valid_url(url).group('access_key_or_mcp', 'id')
access_key, video_id = mobj.group('access_key_or_mcp', 'id')
if access_key not in self._ANVACK_TABLE: if access_key not in self._ANVACK_TABLE:
access_key = self._MCP_TO_ACCESS_KEY_TABLE.get( access_key = self._MCP_TO_ACCESS_KEY_TABLE.get(access_key) or access_key
access_key) or access_key return self._get_anvato_videos(access_key, video_id, smuggled_data.get('token'))
return self._get_anvato_videos(access_key, video_id)
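
The token selection in the new `_get_video_json` can be sketched outside the extractor as follows. This is a minimal illustration under stated assumptions, not the extractor's own code: `build_api_payload` and its `secret` parameter are hypothetical names, `secret` stands in for a lookup in `_ANVACK_TABLE`, and the per-key `_TOKEN_GENERATORS` branch is left out.

```python
import hashlib
import random
import time


def md5_text(s):
    # Hex MD5 digest of the string form of s, mirroring the helper in the diff.
    return hashlib.md5(str(s).encode()).hexdigest()


def build_api_payload(access_key, server_time, secret=None, extracted_token=None):
    # Hypothetical helper: builds the 'api' dict that is sent as the JSON body.
    # Random 30-character request id, derived the same way as in the diff.
    anvrid = md5_text(time.time() * 1000 * random.random())[:30]
    api = {'anvrid': anvrid, 'anvts': server_time}
    if extracted_token is not None:
        # A token found on the embedding page takes priority.
        api['anvstk2'] = extracted_token
    elif secret is not None:
        # Known access key: sign key, request id and timestamp with its secret.
        api['anvstk'] = md5_text(f'{access_key}|{anvrid}|{server_time}|{secret}')
    else:
        # Unknown key and no token: fall back to the literal 'default'.
        api['anvstk2'] = 'default'
    return api
```

The order of the branches matters: an explicitly passed token is never overridden by a signature computed from the table.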


@ -1,5 +0,0 @@
from .nfl import NFLTokenGenerator
__all__ = [
'NFLTokenGenerator',
]


@ -1,3 +0,0 @@
class TokenGenerator:
def generate(self, anvack, mcp_id):
raise NotImplementedError('This method must be implemented by subclasses')


@ -1,28 +0,0 @@
import json
from .common import TokenGenerator
class NFLTokenGenerator(TokenGenerator):
_AUTHORIZATION = None
def generate(ie, anvack, mcp_id):
if not NFLTokenGenerator._AUTHORIZATION:
reroute = ie._download_json(
'https://api.nfl.com/v1/reroute', mcp_id,
data=b'grant_type=client_credentials',
headers={'X-Domain-Id': 100})
NFLTokenGenerator._AUTHORIZATION = '%s %s' % (reroute.get('token_type') or 'Bearer', reroute['access_token'])
return ie._download_json(
'https://api.nfl.com/v3/shield/', mcp_id, data=json.dumps({
'query': '''{
viewer {
mediaToken(anvack: "%s", id: %s) {
token
}
}
}''' % (anvack, mcp_id),
}).encode(), headers={
'Authorization': NFLTokenGenerator._AUTHORIZATION,
'Content-Type': 'application/json',
})['data']['viewer']['mediaToken']['token']
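
The `NFLTokenGenerator` class deleted here is replaced in the Anvato extractor by a plain function stored in the `_TOKEN_GENERATORS` dict and called with `self` passed explicitly. A minimal sketch of that pattern, assuming a hypothetical `DemoExtractor` class, `_generate_demo_token` function, and `demo-key` entry (none of these names appear in the source):

```python
class DemoExtractor:
    prefix = 'tok'

    def _generate_demo_token(self, access_key, video_id):
        # Stand-in for a real token request; only formats its inputs.
        return f'{self.prefix}:{access_key}:{video_id}'

    # The function object is stored as defined in the class body. Looking it
    # up through the dict does not bind it, so callers must pass self.
    _TOKEN_GENERATORS = {
        'demo-key': _generate_demo_token,
    }

    def token_for(self, access_key, video_id):
        generator = self._TOKEN_GENERATORS.get(access_key)
        if generator is None:
            return None
        return generator(self, access_key, video_id)
```

Compared with a class hierarchy, the dict of functions keeps each generator next to the extractor that uses it and needs no separate package.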


@ -16,6 +16,7 @@
get_element_by_id, get_element_by_id,
int_or_none, int_or_none,
join_nonempty, join_nonempty,
js_to_json,
merge_dicts, merge_dicts,
mimetype2ext, mimetype2ext,
orderedSet, orderedSet,
@ -367,7 +368,9 @@ class YoutubeWebArchiveIE(InfoExtractor):
'channel_id': 'UCukCyHaD-bK3in_pKpfH9Eg', 'channel_id': 'UCukCyHaD-bK3in_pKpfH9Eg',
'duration': 32, 'duration': 32,
'uploader_id': 'Zeurel', 'uploader_id': 'Zeurel',
'uploader_url': 'http://www.youtube.com/user/Zeurel' 'uploader_url': 'https://www.youtube.com/user/Zeurel',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'channel_url': 'https://www.youtube.com/channel/UCukCyHaD-bK3in_pKpfH9Eg',
} }
}, { }, {
# Internal link # Internal link
@ -382,7 +385,9 @@ class YoutubeWebArchiveIE(InfoExtractor):
'channel_id': 'UCHnyfMqiRRG1u-2MsSQLbXA', 'channel_id': 'UCHnyfMqiRRG1u-2MsSQLbXA',
'duration': 771, 'duration': 771,
'uploader_id': '1veritasium', 'uploader_id': '1veritasium',
'uploader_url': 'http://www.youtube.com/user/1veritasium' 'uploader_url': 'https://www.youtube.com/user/1veritasium',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'channel_url': 'https://www.youtube.com/channel/UCHnyfMqiRRG1u-2MsSQLbXA',
} }
}, { }, {
# Video from 2012, webm format itag 45. Newest capture is deleted video, with an invalid description. # Video from 2012, webm format itag 45. Newest capture is deleted video, with an invalid description.
@ -396,7 +401,9 @@ class YoutubeWebArchiveIE(InfoExtractor):
'duration': 398, 'duration': 398,
'description': 'md5:ff4de6a7980cb65d951c2f6966a4f2f3', 'description': 'md5:ff4de6a7980cb65d951c2f6966a4f2f3',
'uploader_id': 'machinima', 'uploader_id': 'machinima',
'uploader_url': 'http://www.youtube.com/user/machinima' 'uploader_url': 'https://www.youtube.com/user/machinima',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'uploader': 'machinima'
} }
}, { }, {
# FLV video. Video file URL does not provide itag information # FLV video. Video file URL does not provide itag information
@ -410,7 +417,10 @@ class YoutubeWebArchiveIE(InfoExtractor):
'duration': 19, 'duration': 19,
'description': 'md5:10436b12e07ac43ff8df65287a56efb4', 'description': 'md5:10436b12e07ac43ff8df65287a56efb4',
'uploader_id': 'jawed', 'uploader_id': 'jawed',
'uploader_url': 'http://www.youtube.com/user/jawed' 'uploader_url': 'https://www.youtube.com/user/jawed',
'channel_url': 'https://www.youtube.com/channel/UC4QobU6STFB0P71PMvOGN5A',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'uploader': 'jawed',
} }
}, { }, {
'url': 'https://web.archive.org/web/20110712231407/http://www.youtube.com/watch?v=lTx3G6h2xyA', 'url': 'https://web.archive.org/web/20110712231407/http://www.youtube.com/watch?v=lTx3G6h2xyA',
@ -424,7 +434,9 @@ class YoutubeWebArchiveIE(InfoExtractor):
'duration': 204, 'duration': 204,
'description': 'md5:f7535343b6eda34a314eff8b85444680', 'description': 'md5:f7535343b6eda34a314eff8b85444680',
'uploader_id': 'itsmadeon', 'uploader_id': 'itsmadeon',
'uploader_url': 'http://www.youtube.com/user/itsmadeon' 'uploader_url': 'https://www.youtube.com/user/itsmadeon',
'channel_url': 'https://www.youtube.com/channel/UCqMDNf3Pn5L7pcNkuSEeO3w',
'thumbnail': r're:https?://.*\.(jpg|webp)',
} }
}, { }, {
# First capture is of dead video, second is the oldest from CDX response. # First capture is of dead video, second is the oldest from CDX response.
@ -435,10 +447,13 @@ class YoutubeWebArchiveIE(InfoExtractor):
'title': 'Fake Teen Doctor Strikes AGAIN! - Weekly Weird News', 'title': 'Fake Teen Doctor Strikes AGAIN! - Weekly Weird News',
'upload_date': '20160218', 'upload_date': '20160218',
'channel_id': 'UCdIaNUarhzLSXGoItz7BHVA', 'channel_id': 'UCdIaNUarhzLSXGoItz7BHVA',
'duration': 1236, 'duration': 1235,
'description': 'md5:21032bae736421e89c2edf36d1936947', 'description': 'md5:21032bae736421e89c2edf36d1936947',
'uploader_id': 'MachinimaETC', 'uploader_id': 'MachinimaETC',
'uploader_url': 'http://www.youtube.com/user/MachinimaETC' 'uploader_url': 'https://www.youtube.com/user/MachinimaETC',
'channel_url': 'https://www.youtube.com/channel/UCdIaNUarhzLSXGoItz7BHVA',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'uploader': 'ETC News',
} }
}, { }, {
# First capture of dead video, capture date in link links to dead capture. # First capture of dead video, capture date in link links to dead capture.
@ -449,10 +464,13 @@ class YoutubeWebArchiveIE(InfoExtractor):
'title': 'WTF: Video Games Still Launch BROKEN?! - T.U.G.S.', 'title': 'WTF: Video Games Still Launch BROKEN?! - T.U.G.S.',
'upload_date': '20160219', 'upload_date': '20160219',
'channel_id': 'UCdIaNUarhzLSXGoItz7BHVA', 'channel_id': 'UCdIaNUarhzLSXGoItz7BHVA',
'duration': 798, 'duration': 797,
'description': 'md5:a1dbf12d9a3bd7cb4c5e33b27d77ffe7', 'description': 'md5:a1dbf12d9a3bd7cb4c5e33b27d77ffe7',
'uploader_id': 'MachinimaETC', 'uploader_id': 'MachinimaETC',
'uploader_url': 'http://www.youtube.com/user/MachinimaETC' 'uploader_url': 'https://www.youtube.com/user/MachinimaETC',
'channel_url': 'https://www.youtube.com/channel/UCdIaNUarhzLSXGoItz7BHVA',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'uploader': 'ETC News',
}, },
'expected_warnings': [ 'expected_warnings': [
r'unable to download capture webpage \(it may not be archived\)' r'unable to download capture webpage \(it may not be archived\)'
@ -472,12 +490,11 @@ class YoutubeWebArchiveIE(InfoExtractor):
'title': 'It\'s Bootleg AirPods Time.', 'title': 'It\'s Bootleg AirPods Time.',
'upload_date': '20211021', 'upload_date': '20211021',
'channel_id': 'UC7Jwj9fkrf1adN4fMmTkpug', 'channel_id': 'UC7Jwj9fkrf1adN4fMmTkpug',
'channel_url': 'http://www.youtube.com/channel/UC7Jwj9fkrf1adN4fMmTkpug', 'channel_url': 'https://www.youtube.com/channel/UC7Jwj9fkrf1adN4fMmTkpug',
'duration': 810, 'duration': 810,
'description': 'md5:7b567f898d8237b256f36c1a07d6d7bc', 'description': 'md5:7b567f898d8237b256f36c1a07d6d7bc',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'uploader': 'DankPods', 'uploader': 'DankPods',
'uploader_id': 'UC7Jwj9fkrf1adN4fMmTkpug',
'uploader_url': 'http://www.youtube.com/channel/UC7Jwj9fkrf1adN4fMmTkpug'
} }
}, { }, {
# player response contains '};' See: https://github.com/ytdl-org/youtube-dl/issues/27093 # player response contains '};' See: https://github.com/ytdl-org/youtube-dl/issues/27093
@ -488,12 +505,135 @@ class YoutubeWebArchiveIE(InfoExtractor):
'title': 'bitch lasagna', 'title': 'bitch lasagna',
'upload_date': '20181005', 'upload_date': '20181005',
'channel_id': 'UC-lHJZR3Gqxm24_Vd_AJ5Yw', 'channel_id': 'UC-lHJZR3Gqxm24_Vd_AJ5Yw',
'channel_url': 'http://www.youtube.com/channel/UC-lHJZR3Gqxm24_Vd_AJ5Yw', 'channel_url': 'https://www.youtube.com/channel/UC-lHJZR3Gqxm24_Vd_AJ5Yw',
'duration': 135, 'duration': 135,
'description': 'md5:2dbe4051feeff2dab5f41f82bb6d11d0', 'description': 'md5:2dbe4051feeff2dab5f41f82bb6d11d0',
'uploader': 'PewDiePie', 'uploader': 'PewDiePie',
'uploader_id': 'PewDiePie', 'uploader_id': 'PewDiePie',
'uploader_url': 'http://www.youtube.com/user/PewDiePie' 'uploader_url': 'https://www.youtube.com/user/PewDiePie',
'thumbnail': r're:https?://.*\.(jpg|webp)',
}
}, {
# ~June 2010 Capture. swfconfig
'url': 'https://web.archive.org/web/0/https://www.youtube.com/watch?v=8XeW5ilk-9Y',
'info_dict': {
'id': '8XeW5ilk-9Y',
'ext': 'flv',
'title': 'Story of Stuff, The Critique Part 4 of 4',
'duration': 541,
'description': 'md5:28157da06f2c5e94c97f7f3072509972',
'uploader': 'HowTheWorldWorks',
'uploader_id': 'HowTheWorldWorks',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'uploader_url': 'https://www.youtube.com/user/HowTheWorldWorks',
'upload_date': '20090520',
}
}, {
# Jan 2011: watch-video-date/eow-date surrounded by whitespace
'url': 'https://web.archive.org/web/20110126141719/http://www.youtube.com/watch?v=Q_yjX80U7Yc',
'info_dict': {
'id': 'Q_yjX80U7Yc',
'ext': 'flv',
'title': 'Spray Paint Art by Clay Butler: Purple Fantasy Forest',
'uploader_id': 'claybutlermusic',
'description': 'md5:4595264559e3d0a0ceb3f011f6334543',
'upload_date': '20090803',
'uploader': 'claybutlermusic',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'duration': 132,
'uploader_url': 'https://www.youtube.com/user/claybutlermusic',
}
}, {
# ~May 2009 swfArgs. ytcfg is spread out over various vars
'url': 'https://web.archive.org/web/0/https://www.youtube.com/watch?v=c5uJgG05xUY',
'info_dict': {
'id': 'c5uJgG05xUY',
'ext': 'webm',
'title': 'Story of Stuff, The Critique Part 1 of 4',
'uploader_id': 'HowTheWorldWorks',
'uploader': 'HowTheWorldWorks',
'uploader_url': 'https://www.youtube.com/user/HowTheWorldWorks',
'upload_date': '20090513',
'description': 'md5:4ca77d79538064e41e4cc464e93f44f0',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'duration': 754,
}
}, {
# ~June 2012. Upload date is in another lang so cannot extract.
'url': 'https://web.archive.org/web/20120607174520/http://www.youtube.com/watch?v=xWTLLl-dQaA',
'info_dict': {
'id': 'xWTLLl-dQaA',
'ext': 'mp4',
'title': 'Black Nerd eHarmony Video Bio Parody (SPOOF)',
'uploader_url': 'https://www.youtube.com/user/BlackNerdComedy',
'description': 'md5:e25f0133aaf9e6793fb81c18021d193e',
'uploader_id': 'BlackNerdComedy',
'uploader': 'BlackNerdComedy',
'duration': 182,
'thumbnail': r're:https?://.*\.(jpg|webp)',
}
}, {
# ~July 2013
'url': 'https://web.archive.org/web/*/https://www.youtube.com/watch?v=9eO1aasHyTM',
'info_dict': {
'id': '9eO1aasHyTM',
'ext': 'mp4',
'title': 'Polar-oid',
'description': 'Cameras and bears are dangerous!',
'uploader_url': 'https://www.youtube.com/user/punkybird',
'uploader_id': 'punkybird',
'duration': 202,
'channel_id': 'UC62R2cBezNBOqxSerfb1nMQ',
'channel_url': 'https://www.youtube.com/channel/UC62R2cBezNBOqxSerfb1nMQ',
'upload_date': '20060428',
'uploader': 'punkybird',
}
}, {
# April 2020: Player response in player config
'url': 'https://web.archive.org/web/20200416034815/https://www.youtube.com/watch?v=Cf7vS8jc7dY&gl=US&hl=en',
'info_dict': {
'id': 'Cf7vS8jc7dY',
'ext': 'mp4',
'title': 'A Dramatic Pool Story (by Jamie Spicer-Lewis) - Game Grumps Animated',
'duration': 64,
'upload_date': '20200408',
'uploader_id': 'GameGrumps',
'uploader': 'GameGrumps',
'channel_url': 'https://www.youtube.com/channel/UC9CuvdOVfMPvKCiwdGKL3cQ',
'channel_id': 'UC9CuvdOVfMPvKCiwdGKL3cQ',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'description': 'md5:c625bb3c02c4f5fb4205971e468fa341',
'uploader_url': 'https://www.youtube.com/user/GameGrumps',
}
}, {
# watch7-user-header with yt-user-info
'url': 'ytarchive:kbh4T_b4Ixw:20160307085057',
'info_dict': {
'id': 'kbh4T_b4Ixw',
'ext': 'mp4',
'title': 'Shovel Knight OST - Strike the Earth! Plains of Passage 16 bit SNES style remake / remix',
'channel_url': 'https://www.youtube.com/channel/UCnTaGvsHmMy792DWeT6HbGA',
'uploader': 'Nelward music',
'duration': 213,
'description': 'md5:804b4a9ce37b050a5fefdbb23aeba54d',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'upload_date': '20150503',
'channel_id': 'UCnTaGvsHmMy792DWeT6HbGA',
}
}, {
# April 2012
'url': 'https://web.archive.org/web/0/https://www.youtube.com/watch?v=SOm7mPoPskU',
'info_dict': {
'id': 'SOm7mPoPskU',
'ext': 'mp4',
'title': 'Boyfriend - Justin Bieber Parody',
'uploader_url': 'https://www.youtube.com/user/thecomputernerd01',
'uploader': 'thecomputernerd01',
'thumbnail': r're:https?://.*\.(jpg|webp)',
'description': 'md5:dd7fa635519c2a5b4d566beaecad7491',
'duration': 200,
'upload_date': '20120407',
'uploader_id': 'thecomputernerd01',
} }
}, { }, {
'url': 'https://web.archive.org/web/http://www.youtube.com/watch?v=kH-G_aIBlFw', 'url': 'https://web.archive.org/web/http://www.youtube.com/watch?v=kH-G_aIBlFw',
@ -526,9 +666,10 @@ class YoutubeWebArchiveIE(InfoExtractor):
}, },
] ]
_YT_INITIAL_DATA_RE = YoutubeBaseInfoExtractor._YT_INITIAL_DATA_RE _YT_INITIAL_DATA_RE = YoutubeBaseInfoExtractor._YT_INITIAL_DATA_RE
_YT_INITIAL_PLAYER_RESPONSE_RE = fr'''(?x) _YT_INITIAL_PLAYER_RESPONSE_RE = fr'''(?x:
(?:window\s*\[\s*["\']ytInitialPlayerResponse["\']\s*\]|ytInitialPlayerResponse)\s*=[(\s]*| (?:window\s*\[\s*["\']ytInitialPlayerResponse["\']\s*\]|ytInitialPlayerResponse)\s*=[(\s]*|
{YoutubeBaseInfoExtractor._YT_INITIAL_PLAYER_RESPONSE_RE}''' {YoutubeBaseInfoExtractor._YT_INITIAL_PLAYER_RESPONSE_RE}
)'''
_YT_DEFAULT_THUMB_SERVERS = ['i.ytimg.com'] # thumbnails most likely archived on these servers _YT_DEFAULT_THUMB_SERVERS = ['i.ytimg.com'] # thumbnails most likely archived on these servers
_YT_ALL_THUMB_SERVERS = orderedSet( _YT_ALL_THUMB_SERVERS = orderedSet(
@ -573,6 +714,27 @@ def _extract_metadata(self, video_id, webpage):
initial_data = self._search_json( initial_data = self._search_json(
self._YT_INITIAL_DATA_RE, webpage, 'initial data', video_id, default={}) self._YT_INITIAL_DATA_RE, webpage, 'initial data', video_id, default={})
ytcfg = {}
for j in re.findall(r'yt\.setConfig\(\s*(?P<json>{\s*(?s:.+?)\s*})\s*\);', webpage): # ~June 2010
ytcfg.update(self._parse_json(j, video_id, fatal=False, ignore_extra=True, transform_source=js_to_json, errnote='') or {})
# XXX: this also may contain a 'ptchn' key
player_config = (
self._search_json(
r'(?:yt\.playerConfig|ytplayer\.config|swfConfig)\s*=',
webpage, 'player config', video_id, default=None)
or ytcfg.get('PLAYER_CONFIG') or {})
# XXX: this may also contain a 'creator' key.
swf_args = self._search_json(r'swfArgs\s*=', webpage, 'swf config', video_id, default={})
if swf_args and not traverse_obj(player_config, ('args',)):
player_config['args'] = swf_args
if not player_response:
# April 2020
player_response = self._parse_json(
traverse_obj(player_config, ('args', 'player_response')) or '{}', video_id, fatal=False)
initial_data_video = traverse_obj( initial_data_video = traverse_obj(
initial_data, ('contents', 'twoColumnWatchNextResults', 'results', 'results', 'contents', ..., 'videoPrimaryInfoRenderer'), initial_data, ('contents', 'twoColumnWatchNextResults', 'results', 'results', 'contents', ..., 'videoPrimaryInfoRenderer'),
expected_type=dict, get_all=False, default={}) expected_type=dict, get_all=False, default={})
@ -587,21 +749,64 @@ def _extract_metadata(self, video_id, webpage):
video_details.get('title') video_details.get('title')
or YoutubeBaseInfoExtractor._get_text(microformats, 'title') or YoutubeBaseInfoExtractor._get_text(microformats, 'title')
or YoutubeBaseInfoExtractor._get_text(initial_data_video, 'title') or YoutubeBaseInfoExtractor._get_text(initial_data_video, 'title')
or traverse_obj(player_config, ('args', 'title'))
or self._extract_webpage_title(webpage) or self._extract_webpage_title(webpage)
or search_meta(['og:title', 'twitter:title', 'title'])) or search_meta(['og:title', 'twitter:title', 'title']))
def id_from_url(url, type_):
return self._search_regex(
rf'(?:{type_})/([^/#&?]+)', url or '', f'{type_} id', default=None)
# XXX: would the get_elements_by_... functions be better suited here?
_CHANNEL_URL_HREF_RE = r'href="[^"]*(?P<url>https?://www\.youtube\.com/(?:user|channel)/[^"]+)"'
uploader_or_channel_url = self._search_regex(
[fr'<(?:link\s*itemprop=\"url\"|a\s*id=\"watch-username\").*?\b{_CHANNEL_URL_HREF_RE}>', # @fd05024
fr'<div\s*id=\"(?:watch-channel-stats|watch-headline-user-info)\"[^>]*>\s*<a[^>]*\b{_CHANNEL_URL_HREF_RE}'], # ~ May 2009, ~June 2012
webpage, 'uploader or channel url', default=None)
owner_profile_url = url_or_none(microformats.get('ownerProfileUrl')) # @a6211d2
# Uploader refers to the /user/ id ONLY
uploader_id = (
id_from_url(owner_profile_url, 'user')
or id_from_url(uploader_or_channel_url, 'user')
or ytcfg.get('VIDEO_USERNAME'))
uploader_url = f'https://www.youtube.com/user/{uploader_id}' if uploader_id else None
# XXX: do we want to differentiate uploader and channel?
uploader = (
self._search_regex(
[r'<a\s*id="watch-username"[^>]*>\s*<strong>([^<]+)</strong>', # June 2010
r'var\s*watchUsername\s*=\s*\'(.+?)\';', # ~May 2009
r'<div\s*\bid=\"watch-channel-stats"[^>]*>\s*<a[^>]*>\s*(.+?)\s*</a', # ~May 2009
r'<a\s*id="watch-userbanner"[^>]*title="\s*(.+?)\s*"'], # ~June 2012
webpage, 'uploader', default=None)
or self._html_search_regex(
[r'(?s)<div\s*class="yt-user-info".*?<a[^>]*[^>]*>\s*(.*?)\s*</a', # March 2016
r'(?s)<a[^>]*yt-user-name[^>]*>\s*(.*?)\s*</a'], # July 2013
get_element_by_id('watch7-user-header', webpage), 'uploader', default=None)
or self._html_search_regex(
r'<button\s*href="/user/[^>]*>\s*<span[^>]*>\s*(.+?)\s*<', # April 2012
get_element_by_id('watch-headline-user-info', webpage), 'uploader', default=None)
or traverse_obj(player_config, ('args', 'creator'))
or video_details.get('author'))
channel_id = str_or_none( channel_id = str_or_none(
video_details.get('channelId') video_details.get('channelId')
or microformats.get('externalChannelId') or microformats.get('externalChannelId')
or search_meta('channelId') or search_meta('channelId')
or self._search_regex( or self._search_regex(
r'data-channel-external-id=(["\'])(?P<id>(?:(?!\1).)+)\1', # @b45a9e6 r'data-channel-external-id=(["\'])(?P<id>(?:(?!\1).)+)\1', # @b45a9e6
webpage, 'channel id', default=None, group='id')) webpage, 'channel id', default=None, group='id')
channel_url = f'http://www.youtube.com/channel/{channel_id}' if channel_id else None or id_from_url(owner_profile_url, 'channel')
or id_from_url(uploader_or_channel_url, 'channel')
or traverse_obj(player_config, ('args', 'ucid')))
channel_url = f'https://www.youtube.com/channel/{channel_id}' if channel_id else None
duration = int_or_none( duration = int_or_none(
video_details.get('lengthSeconds') video_details.get('lengthSeconds')
or microformats.get('lengthSeconds') or microformats.get('lengthSeconds')
or traverse_obj(player_config, ('args', ('length_seconds', 'l')), get_all=False)
or parse_duration(search_meta('duration'))) or parse_duration(search_meta('duration')))
description = ( description = (
video_details.get('shortDescription') video_details.get('shortDescription')
@ -609,26 +814,13 @@ def _extract_metadata(self, video_id, webpage):
or clean_html(get_element_by_id('eow-description', webpage)) # @9e6dd23 or clean_html(get_element_by_id('eow-description', webpage)) # @9e6dd23
or search_meta(['description', 'og:description', 'twitter:description'])) or search_meta(['description', 'og:description', 'twitter:description']))
uploader = video_details.get('author')
# Uploader ID and URL
uploader_mobj = re.search(
r'<link itemprop="url" href="(?P<uploader_url>https?://www\.youtube\.com/(?:user|channel)/(?P<uploader_id>[^"]+))">', # @fd05024
webpage)
if uploader_mobj is not None:
uploader_id, uploader_url = uploader_mobj.group('uploader_id'), uploader_mobj.group('uploader_url')
else:
# @a6211d2
uploader_url = url_or_none(microformats.get('ownerProfileUrl'))
uploader_id = self._search_regex(
r'(?:user|channel)/([^/]+)', uploader_url or '', 'uploader id', default=None)
upload_date = unified_strdate( upload_date = unified_strdate(
dict_get(microformats, ('uploadDate', 'publishDate')) dict_get(microformats, ('uploadDate', 'publishDate'))
or search_meta(['uploadDate', 'datePublished']) or search_meta(['uploadDate', 'datePublished'])
or self._search_regex( or self._search_regex(
[r'(?s)id="eow-date.*?>(.*?)</span>', [r'(?s)id="eow-date.*?>\s*(.*?)\s*</span>',
r'(?:id="watch-uploader-info".*?>.*?|["\']simpleText["\']\s*:\s*["\'])(?:Published|Uploaded|Streamed live|Started) on (.+?)[<"\']'], # @7998520 r'(?:id="watch-uploader-info".*?>.*?|["\']simpleText["\']\s*:\s*["\'])(?:Published|Uploaded|Streamed live|Started) on (.+?)[<"\']', # @7998520
r'class\s*=\s*"(?:watch-video-date|watch-video-added post-date)"[^>]*>\s*([^<]+?)\s*<'], # ~June 2010, ~Jan 2009 (respectively)
webpage, 'upload date', default=None)) webpage, 'upload date', default=None))
return { return {
@ -697,6 +889,8 @@ def _real_extract(self, url):
url_date = url_date or url_date_2 url_date = url_date or url_date_2
urlh = None urlh = None
retry_manager = self.RetryManager(fatal=False)
for retry in retry_manager:
try: try:
urlh = self._request_webpage( urlh = self._request_webpage(
HEADRequest('https://web.archive.org/web/2oe_/http://wayback-fakeurl.archive.org/yt/%s' % video_id), HEADRequest('https://web.archive.org/web/2oe_/http://wayback-fakeurl.archive.org/yt/%s' % video_id),
@ -705,10 +899,12 @@ def _real_extract(self, url):
# HTTP Error 404 is expected if the video is not saved. # HTTP Error 404 is expected if the video is not saved.
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404: if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404:
self.raise_no_formats( self.raise_no_formats(
'The requested video is not archived, indexed, or there is an issue with web.archive.org', 'The requested video is not archived, indexed, or there is an issue with web.archive.org (try again later)', expected=True)
expected=True)
else: else:
raise retry.error = e
if retry_manager.error:
self.raise_no_formats(retry_manager.error, expected=True, video_id=video_id)
capture_dates = self._get_capture_dates(video_id, int_or_none(url_date)) capture_dates = self._get_capture_dates(video_id, int_or_none(url_date))
self.write_debug('Captures to try: ' + join_nonempty(*capture_dates, delim=', ')) self.write_debug('Captures to try: ' + join_nonempty(*capture_dates, delim=', '))
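Reviewer note: the hunk above replaces a bare `raise` with `retry.error = e` inside a `for retry in retry_manager:` loop, so a transient archive.org failure is retried and only reported once attempts run out. The following is a minimal sketch of that control flow, assuming a simplified stand-in class; `SimpleRetryManager` and `fetch_with_retries` are hypothetical names for illustration and are not yt-dlp's actual `RetryManager`.

```python
class SimpleRetryManager:
    """Hypothetical, simplified stand-in for the retry helper used in the diff."""

    def __init__(self, retries=3):
        self.retries = retries
        self.attempt = 0
        self.error = None

    def __iter__(self):
        # The first pass always runs. Later passes run only if the previous
        # pass recorded an error and retries remain.
        while self.attempt == 0 or (self.error is not None and self.attempt <= self.retries):
            self.error = None
            self.attempt += 1
            yield self


def fetch_with_retries(fetch, retries=3):
    """Call fetch() until it succeeds or retries are exhausted.

    Returns (result, last_error); last_error is None on success.
    """
    manager = SimpleRetryManager(retries)
    result = None
    for retry in manager:
        try:
            result = fetch()
        except OSError as exc:
            # Recording the error (instead of raising) requests another attempt.
            retry.error = exc
    return result, manager.error
```

Because the last error stays on the manager after the loop, the caller can decide how to report it, which mirrors the `if retry_manager.error:` check in the hunk.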


@@ -95,24 +95,24 @@ class ArteTVIE(ArteTVBaseIE):
 # all obtained by exhaustive testing
 _COUNTRIES_MAP = {
-'DE_FR': {
+'DE_FR': (
 'BL', 'DE', 'FR', 'GF', 'GP', 'MF', 'MQ', 'NC',
 'PF', 'PM', 'RE', 'WF', 'YT',
-},
+),
 # with both of the below 'BE' sometimes works, sometimes doesn't
-'EUR_DE_FR': {
+'EUR_DE_FR': (
 'AT', 'BL', 'CH', 'DE', 'FR', 'GF', 'GP', 'LI',
 'MC', 'MF', 'MQ', 'NC', 'PF', 'PM', 'RE', 'WF',
 'YT',
-},
-'SAT': {
+),
+'SAT': (
 'AD', 'AT', 'AX', 'BG', 'BL', 'CH', 'CY', 'CZ',
 'DE', 'DK', 'EE', 'ES', 'FI', 'FR', 'GB', 'GF',
 'GR', 'HR', 'HU', 'IE', 'IS', 'IT', 'KN', 'LI',
 'LT', 'LU', 'LV', 'MC', 'MF', 'MQ', 'MT', 'NC',
 'NL', 'NO', 'PF', 'PL', 'PM', 'PT', 'RE', 'RO',
 'SE', 'SI', 'SK', 'SM', 'VA', 'WF', 'YT',
-},
+),
 }
 def _real_extract(self, url):
@ -135,6 +135,7 @@ def _real_extract(self, url):
'Video is not available in this language edition of Arte or broadcast rights expired', expected=True) 'Video is not available in this language edition of Arte or broadcast rights expired', expected=True)
formats, subtitles = [], {} formats, subtitles = [], {}
secondary_formats = []
for stream in config['data']['attributes']['streams']: for stream in config['data']['attributes']['streams']:
# official player contains code like `e.get("versions")[0].eStat.ml5` # official player contains code like `e.get("versions")[0].eStat.ml5`
stream_version = stream['versions'][0] stream_version = stream['versions'][0]
@ -152,14 +153,18 @@ def _real_extract(self, url):
not m.group('sdh_sub'), # and we prefer not the hard-of-hearing subtitles if there are subtitles not m.group('sdh_sub'), # and we prefer not the hard-of-hearing subtitles if there are subtitles
))) )))
short_label = traverse_obj(stream_version, 'shortLabel', expected_type=str, default='?')
if stream['protocol'].startswith('HLS'): if stream['protocol'].startswith('HLS'):
fmts, subs = self._extract_m3u8_formats_and_subtitles( fmts, subs = self._extract_m3u8_formats_and_subtitles(
stream['url'], video_id=video_id, ext='mp4', m3u8_id=stream_version_code, fatal=False) stream['url'], video_id=video_id, ext='mp4', m3u8_id=stream_version_code, fatal=False)
for fmt in fmts: for fmt in fmts:
fmt.update({ fmt.update({
'format_note': f'{stream_version.get("label", "unknown")} [{stream_version.get("shortLabel", "?")}]', 'format_note': f'{stream_version.get("label", "unknown")} [{short_label}]',
'language_preference': lang_pref, 'language_preference': lang_pref,
}) })
if any(map(short_label.startswith, ('cc', 'OGsub'))):
secondary_formats.extend(fmts)
else:
formats.extend(fmts) formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles) self._merge_subtitles(subs, target=subtitles)
@ -167,7 +172,7 @@ def _real_extract(self, url):
formats.append({ formats.append({
'format_id': f'{stream["protocol"]}-{stream_version_code}', 'format_id': f'{stream["protocol"]}-{stream_version_code}',
'url': stream['url'], 'url': stream['url'],
'format_note': f'{stream_version.get("label", "unknown")} [{stream_version.get("shortLabel", "?")}]', 'format_note': f'{stream_version.get("label", "unknown")} [{short_label}]',
'language_preference': lang_pref, 'language_preference': lang_pref,
# 'ext': 'mp4', # XXX: may or may not be necessary, at least for HTTPS # 'ext': 'mp4', # XXX: may or may not be necessary, at least for HTTPS
}) })
@ -179,6 +184,8 @@ def _real_extract(self, url):
# The JS also looks for chapters in config['data']['attributes']['chapters'], # The JS also looks for chapters in config['data']['attributes']['chapters'],
# but I am yet to find a video having those # but I am yet to find a video having those
formats.extend(secondary_formats)
self._remove_duplicate_formats(formats)
self._sort_formats(formats) self._sort_formats(formats)
metadata = config['data']['attributes']['metadata'] metadata = config['data']['attributes']['metadata']
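Reviewer note: the Arte hunks above collect streams whose short label starts with `cc` or `OGsub` into `secondary_formats` and append them after the regular formats. A minimal sketch of that partitioning follows; the `short_label` dictionary key and the function name `order_formats` are assumptions made for illustration, not names from the extractor.

```python
def order_formats(formats, secondary_prefixes=('cc', 'OGsub')):
    """Return formats with 'secondary' ones (by label prefix) moved to the end.

    Relative order inside each group is preserved.
    """
    prefixes = tuple(secondary_prefixes)
    primary, secondary = [], []
    for fmt in formats:
        label = fmt.get('short_label') or '?'
        # str.startswith accepts a tuple, which is equivalent to
        # any(map(label.startswith, prefixes)) as written in the diff.
        if label.startswith(prefixes):
            secondary.append(fmt)
        else:
            primary.append(fmt)
    return primary + secondary
```

Keeping the secondary group in a separate list until the end means its position does not depend on the order in which the API returns streams.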


@ -1,24 +1,33 @@
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import clean_html, float_or_none, traverse_obj, unescapeHTML
clean_html,
float_or_none,
)
class AudioBoomIE(InfoExtractor): class AudioBoomIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?audioboom\.com/(?:boos|posts)/(?P<id>[0-9]+)' _VALID_URL = r'https?://(?:www\.)?audioboom\.com/(?:boos|posts)/(?P<id>[0-9]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://audioboom.com/posts/7398103-asim-chaudhry', 'url': 'https://audioboom.com/posts/7398103-asim-chaudhry',
'md5': '7b00192e593ff227e6a315486979a42d', 'md5': '4d68be11c9f9daf3dab0778ad1e010c3',
'info_dict': { 'info_dict': {
'id': '7398103', 'id': '7398103',
'ext': 'mp3', 'ext': 'mp3',
'title': 'Asim Chaudhry', 'title': 'Asim Chaudhry',
'description': 'md5:2f3fef17dacc2595b5362e1d7d3602fc', 'description': 'md5:0ed714ae0e81e5d9119cac2f618ad679',
'duration': 4000.99, 'duration': 4000.99,
'uploader': 'Sue Perkins: An hour or so with...', 'uploader': 'Sue Perkins: An hour or so with...',
'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/perkins', 'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/perkins',
} }
}, { # Direct mp3-file link
'url': 'https://audioboom.com/posts/8128496.mp3',
'md5': 'e329edf304d450def95c7f86a9165ee1',
'info_dict': {
'id': '8128496',
'ext': 'mp3',
'title': 'TCRNo8 / DAILY 03 - In Control',
'description': 'md5:44665f142db74858dfa21c5b34787948',
'duration': 1689.7,
'uploader': 'Lost Dot Podcast: The Trans Pyrenees and Transcontinental Race',
'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channels/5003904',
}
}, { }, {
'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0', 'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0',
'only_matching': True, 'only_matching': True,
@ -26,45 +35,23 @@ class AudioBoomIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(f'https://audioboom.com/posts/{video_id}', video_id)
webpage = self._download_webpage(url, video_id) clip_store = self._search_json(
r'data-react-class="V5DetailPagePlayer"\s*data-react-props=["\']',
clip = None webpage, 'clip store', video_id, fatal=False, transform_source=unescapeHTML)
clip = traverse_obj(clip_store, ('clips', 0), expected_type=dict) or {}
clip_store = self._parse_json(
self._html_search_regex(
r'data-new-clip-store=(["\'])(?P<json>{.+?})\1',
webpage, 'clip store', default='{}', group='json'),
video_id, fatal=False)
if clip_store:
clips = clip_store.get('clips')
if clips and isinstance(clips, list) and isinstance(clips[0], dict):
clip = clips[0]
def from_clip(field):
if clip:
return clip.get(field)
audio_url = from_clip('clipURLPriorToLoading') or self._og_search_property(
'audio', webpage, 'audio url')
title = from_clip('title') or self._html_search_meta(
['og:title', 'og:audio:title', 'audio_title'], webpage)
description = from_clip('description') or clean_html(from_clip('formattedDescription')) or self._og_search_description(webpage)
duration = float_or_none(from_clip('duration') or self._html_search_meta(
'weibo:audio:duration', webpage))
uploader = from_clip('author') or self._html_search_meta(
['og:audio:artist', 'twitter:audio:artist_name', 'audio_artist'], webpage, 'uploader')
uploader_url = from_clip('author_url') or self._html_search_meta(
'audioboo:channel', webpage, 'uploader url')
return { return {
'id': video_id, 'id': video_id,
'url': audio_url, 'url': clip.get('clipURLPriorToLoading') or self._og_search_property('audio', webpage, 'audio url'),
'title': title, 'title': clip.get('title') or self._html_search_meta(['og:title', 'og:audio:title', 'audio_title'], webpage),
'description': description, 'description': (clip.get('description') or clean_html(clip.get('formattedDescription'))
'duration': duration, or self._og_search_description(webpage)),
'uploader': uploader, 'duration': float_or_none(clip.get('duration') or self._html_search_meta('weibo:audio:duration', webpage)),
'uploader_url': uploader_url, 'uploader': clip.get('author') or self._html_search_meta(
['og:audio:artist', 'twitter:audio:artist_name', 'audio_artist'], webpage, 'uploader'),
'uploader_url': clip.get('author_url') or self._html_search_regex(
r'<div class="avatar flex-shrink-0">\s*<a href="(?P<uploader_url>http[^"]+)"',
webpage, 'uploader url', fatal=False),
} }
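Reviewer note: the new AudioBoom code reads the clip data from an HTML-escaped `data-react-props` attribute, passing `unescapeHTML` as `transform_source`. The sketch below shows the same idea with the standard library only. It is an approximation under stated assumptions: `extract_react_props` is a hypothetical helper, it expects double-quoted attributes in this exact order, and it does not reproduce the behaviour of `_search_json`.

```python
import html
import json
import re


def extract_react_props(webpage, react_class):
    """Return the decoded JSON from a data-react-props attribute, or {}."""
    match = re.search(
        r'data-react-class="%s"\s*data-react-props="([^"]*)"' % re.escape(react_class),
        webpage)
    if not match:
        return {}
    # The attribute value is HTML-escaped JSON (&quot; instead of "),
    # so it has to be unescaped before parsing.
    return json.loads(html.unescape(match.group(1)))


sample = ('<div data-react-class="V5DetailPagePlayer" '
          'data-react-props="{&quot;clips&quot;:[{&quot;title&quot;:&quot;Example&quot;}]}"></div>')
props = extract_react_props(sample, 'V5DetailPagePlayer')
first_clip = (props.get('clips') or [{}])[0]
```

Returning an empty dictionary when nothing matches lets the caller fall back to the page's meta tags, as the `clip.get(...) or ...` chains in the hunk do.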


@ -5,23 +5,23 @@
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import compat_str
from ..utils import ( from ..utils import (
KNOWN_EXTENSIONS,
ExtractorError, ExtractorError,
float_or_none, float_or_none,
int_or_none, int_or_none,
KNOWN_EXTENSIONS,
parse_filesize, parse_filesize,
str_or_none, str_or_none,
try_get, try_get,
update_url_query,
unified_strdate, unified_strdate,
unified_timestamp, unified_timestamp,
update_url_query,
url_or_none, url_or_none,
urljoin, urljoin,
) )
class BandcampIE(InfoExtractor): class BandcampIE(InfoExtractor):
_VALID_URL = r'https?://[^/]+\.bandcamp\.com/track/(?P<id>[^/?#&]+)' _VALID_URL = r'https?://(?P<uploader>[^/]+)\.bandcamp\.com/track/(?P<id>[^/?#&]+)'
_EMBED_REGEX = [r'<meta property="og:url"[^>]*?content="(?P<url>.*?bandcamp\.com.*?)"'] _EMBED_REGEX = [r'<meta property="og:url"[^>]*?content="(?P<url>.*?bandcamp\.com.*?)"']
_TESTS = [{ _TESTS = [{
'url': 'http://youtube-dl.bandcamp.com/track/youtube-dl-test-song', 'url': 'http://youtube-dl.bandcamp.com/track/youtube-dl-test-song',
@@ -85,7 +85,7 @@ def _extract_data_attr(self, webpage, video_id, attr='tralbum', fatal=True):
 attr + ' data', group=2), video_id, fatal=fatal)
 def _real_extract(self, url):
-title = self._match_id(url)
+title, uploader = self._match_valid_url(url).group('id', 'uploader')
 webpage = self._download_webpage(url, title)
 tralbum = self._extract_data_attr(webpage, title)
 thumbnail = self._og_search_thumbnail(webpage)
@ -197,6 +197,8 @@ def _real_extract(self, url):
'title': title, 'title': title,
'thumbnail': thumbnail, 'thumbnail': thumbnail,
'uploader': artist, 'uploader': artist,
'uploader_id': uploader,
'uploader_url': f'https://{uploader}.bandcamp.com',
'timestamp': timestamp, 'timestamp': timestamp,
'release_timestamp': unified_timestamp(tralbum.get('album_release_date')), 'release_timestamp': unified_timestamp(tralbum.get('album_release_date')),
'duration': duration, 'duration': duration,
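Reviewer note: the Bandcamp change adds a named `uploader` group to `_VALID_URL` and derives `uploader_id` and `uploader_url` from the subdomain. A small standalone sketch of that derivation, using the pattern from the diff; the function name `parse_track_url` is hypothetical.

```python
import re

# Pattern copied from the new _VALID_URL in the diff.
VALID_URL = r'https?://(?P<uploader>[^/]+)\.bandcamp\.com/track/(?P<id>[^/?#&]+)'


def parse_track_url(url):
    """Split a Bandcamp track URL into track id and uploader fields."""
    match = re.match(VALID_URL, url)
    if not match:
        return None
    # Match.group() with several names returns a tuple in the same order.
    track_id, uploader = match.group('id', 'uploader')
    return {
        'id': track_id,
        'uploader_id': uploader,
        'uploader_url': f'https://{uploader}.bandcamp.com',
    }


info = parse_track_url('http://youtube-dl.bandcamp.com/track/youtube-dl-test-song')
```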


@ -0,0 +1,70 @@
from .common import InfoExtractor
from ..utils import float_or_none, mimetype2ext, traverse_obj
class BerufeTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?web\.arbeitsagentur\.de/berufetv/[^?#]+/film;filmId=(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://web.arbeitsagentur.de/berufetv/studienberufe/wirtschaftswissenschaften/wirtschaftswissenschaften-volkswirtschaft/film;filmId=DvKC3DUpMKvUZ_6fEnfg3u',
'md5': '041b6432ec8e6838f84a5c30f31cc795',
'info_dict': {
'id': 'DvKC3DUpMKvUZ_6fEnfg3u',
'ext': 'mp4',
'title': 'Volkswirtschaftslehre',
'description': 'md5:6bd87d0c63163480a6489a37526ee1c1',
'categories': ['Studien&shy;beruf'],
'tags': ['Studienfilm'],
'duration': 602.440,
'thumbnail': r're:^https://asset-out-cdn\.video-cdn\.net/private/videos/DvKC3DUpMKvUZ_6fEnfg3u/thumbnails/793063\?quality=thumbnail&__token__=[^\s]+$',
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
movie_metadata = self._download_json(
'https://rest.arbeitsagentur.de/infosysbub/berufetv/pc/v1/film-metadata',
video_id, 'Downloading JSON metadata',
headers={'X-API-Key': '79089773-4892-4386-86e6-e8503669f426'}, fatal=False)
meta = traverse_obj(
movie_metadata, ('metadaten', lambda _, i: video_id == i['miId']),
get_all=False, default={})
video = self._download_json(
f'https://d.video-cdn.net/play/player/8YRzUk6pTzmBdrsLe9Y88W/video/{video_id}',
video_id, 'Downloading video JSON')
formats, subtitles = [], {}
for key, source in video['videoSources']['html'].items():
if key == 'auto':
fmts, subs = self._extract_m3u8_formats_and_subtitles(source[0]['source'], video_id)
formats += fmts
subtitles = subs
else:
formats.append({
'url': source[0]['source'],
'ext': mimetype2ext(source[0]['mimeType']),
'format_id': key,
})
for track in video.get('videoTracks') or []:
if track.get('type') != 'SUBTITLES':
continue
subtitles.setdefault(track['language'], []).append({
'url': track['source'],
'name': track.get('label'),
'ext': 'vtt'
})
return {
'id': video_id,
'title': meta.get('titel') or traverse_obj(video, ('videoMetaData', 'title')),
'description': meta.get('beschreibung'),
'thumbnail': meta.get('thumbnail') or f'https://asset-out-cdn.video-cdn.net/private/videos/{video_id}/thumbnails/active',
'duration': float_or_none(video.get('duration'), scale=1000),
'categories': [meta['kategorie']] if meta.get('kategorie') else None,
'tags': meta.get('themengebiete'),
'subtitles': subtitles,
'formats': formats,
}
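Reviewer note: the BerufeTV extractor groups subtitle tracks by language with `dict.setdefault` and skips tracks of any other type. The sketch below isolates that loop so it can be tried on sample data; `collect_subtitles` is a hypothetical name, and the input shape is assumed from the fields the extractor reads (`type`, `language`, `source`, `label`).

```python
def collect_subtitles(tracks):
    """Group subtitle tracks by language code."""
    subtitles = {}
    for track in tracks or []:
        if track.get('type') != 'SUBTITLES':
            continue
        # setdefault creates the per-language list on first use, so several
        # tracks for one language end up in the same list.
        subtitles.setdefault(track['language'], []).append({
            'url': track['source'],
            'name': track.get('label'),
            'ext': 'vtt',
        })
    return subtitles
```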


@ -2,8 +2,9 @@
import hashlib import hashlib
import itertools import itertools
import functools import functools
import re
import math import math
import re
import urllib
from .common import InfoExtractor, SearchInfoExtractor from .common import InfoExtractor, SearchInfoExtractor
from ..compat import ( from ..compat import (
@ -13,23 +14,24 @@
) )
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
InAdvancePagedList,
OnDemandPagedList,
filter_dict, filter_dict,
int_or_none,
float_or_none, float_or_none,
int_or_none,
mimetype2ext, mimetype2ext,
parse_count,
parse_iso8601, parse_iso8601,
qualities, qualities,
traverse_obj,
parse_count,
smuggle_url, smuggle_url,
srt_subtitles_timecode, srt_subtitles_timecode,
str_or_none, str_or_none,
strip_jsonp, strip_jsonp,
traverse_obj,
unified_timestamp, unified_timestamp,
unsmuggle_url, unsmuggle_url,
urlencode_postdata, urlencode_postdata,
url_or_none, url_or_none,
OnDemandPagedList
) )
@ -218,6 +220,9 @@ def _real_extract(self, url):
durl = traverse_obj(video_info, ('dash', 'video')) durl = traverse_obj(video_info, ('dash', 'video'))
audios = traverse_obj(video_info, ('dash', 'audio')) or [] audios = traverse_obj(video_info, ('dash', 'audio')) or []
flac_audio = traverse_obj(video_info, ('dash', 'flac', 'audio'))
if flac_audio:
audios.append(flac_audio)
entries = [] entries = []
RENDITIONS = ('qn=80&quality=80&type=', 'quality=2&type=mp4') RENDITIONS = ('qn=80&quality=80&type=', 'quality=2&type=mp4')
@ -502,39 +507,135 @@ def _real_extract(self, url):
season_info.get('bangumi_title'), season_info.get('evaluate')) season_info.get('bangumi_title'), season_info.get('evaluate'))
class BilibiliChannelIE(InfoExtractor): class BilibiliSpaceBaseIE(InfoExtractor):
_VALID_URL = r'https?://space.bilibili\.com/(?P<id>\d+)' def _extract_playlist(self, fetch_page, get_metadata, get_entries):
_API_URL = "https://api.bilibili.com/x/space/arc/search?mid=%s&pn=%d&jsonp=jsonp" first_page = fetch_page(0)
metadata = get_metadata(first_page)
paged_list = InAdvancePagedList(
lambda idx: get_entries(fetch_page(idx) if idx else first_page),
metadata['page_count'], metadata['page_size'])
return metadata, paged_list
class BilibiliSpaceVideoIE(BilibiliSpaceBaseIE):
_VALID_URL = r'https?://space\.bilibili\.com/(?P<id>\d+)(?P<video>/video)?/?(?:[?#]|$)'
_TESTS = [{ _TESTS = [{
'url': 'https://space.bilibili.com/3985676/video', 'url': 'https://space.bilibili.com/3985676/video',
'info_dict': {}, 'info_dict': {
'playlist_mincount': 112, 'id': '3985676',
},
'playlist_mincount': 178,
}] }]
def _entries(self, list_id): def _real_extract(self, url):
count, max_count = 0, None playlist_id, is_video_url = self._match_valid_url(url).group('id', 'video')
if not is_video_url:
self.to_screen('A channel URL was given. Only the channel\'s videos will be downloaded. '
'To download audios, add a "/audio" to the URL')
for page_num in itertools.count(1): def fetch_page(page_idx):
data = self._download_json( try:
self._API_URL % (list_id, page_num), list_id, note=f'Downloading page {page_num}')['data'] response = self._download_json('https://api.bilibili.com/x/space/arc/search',
playlist_id, note=f'Downloading page {page_idx}',
query={'mid': playlist_id, 'pn': page_idx + 1, 'jsonp': 'jsonp'})
except ExtractorError as e:
if isinstance(e.cause, urllib.error.HTTPError) and e.cause.code == 412:
raise ExtractorError(
'Request is blocked by server (412), please add cookies, wait and try later.', expected=True)
raise
if response['code'] == -401:
raise ExtractorError(
'Request is blocked by server (401), please add cookies, wait and try later.', expected=True)
return response['data']
max_count = max_count or traverse_obj(data, ('page', 'count')) def get_metadata(page_data):
page_size = page_data['page']['ps']
entry_count = page_data['page']['count']
return {
'page_count': math.ceil(entry_count / page_size),
'page_size': page_size,
}
entries = traverse_obj(data, ('list', 'vlist')) def get_entries(page_data):
if not entries: for entry in traverse_obj(page_data, ('list', 'vlist')) or []:
return yield self.url_result(f'https://www.bilibili.com/video/{entry["bvid"]}', BiliBiliIE, entry['bvid'])
for entry in entries:
yield self.url_result(
'https://www.bilibili.com/video/%s' % entry['bvid'],
BiliBiliIE.ie_key(), entry['bvid'])
count += len(entries) metadata, paged_list = self._extract_playlist(fetch_page, get_metadata, get_entries)
if max_count and count >= max_count: return self.playlist_result(paged_list, playlist_id)
return
class BilibiliSpaceAudioIE(BilibiliSpaceBaseIE):
_VALID_URL = r'https?://space\.bilibili\.com/(?P<id>\d+)/audio'
_TESTS = [{
'url': 'https://space.bilibili.com/3985676/audio',
'info_dict': {
'id': '3985676',
},
'playlist_mincount': 1,
}]
def _real_extract(self, url): def _real_extract(self, url):
list_id = self._match_id(url) playlist_id = self._match_id(url)
return self.playlist_result(self._entries(list_id), list_id)
def fetch_page(page_idx):
return self._download_json(
'https://api.bilibili.com/audio/music-service/web/song/upper', playlist_id,
note=f'Downloading page {page_idx}',
query={'uid': playlist_id, 'pn': page_idx + 1, 'ps': 30, 'order': 1, 'jsonp': 'jsonp'})['data']
def get_metadata(page_data):
return {
'page_count': page_data['pageCount'],
'page_size': page_data['pageSize'],
}
def get_entries(page_data):
for entry in page_data.get('data', []):
yield self.url_result(f'https://www.bilibili.com/audio/au{entry["id"]}', BilibiliAudioIE, entry['id'])
metadata, paged_list = self._extract_playlist(fetch_page, get_metadata, get_entries)
return self.playlist_result(paged_list, playlist_id)
class BilibiliSpacePlaylistIE(BilibiliSpaceBaseIE):
_VALID_URL = r'https?://space.bilibili\.com/(?P<mid>\d+)/channel/collectiondetail\?sid=(?P<sid>\d+)'
_TESTS = [{
'url': 'https://space.bilibili.com/2142762/channel/collectiondetail?sid=57445',
'info_dict': {
'id': '2142762_57445',
'title': '《底特律 变人》'
},
'playlist_mincount': 31,
}]
def _real_extract(self, url):
mid, sid = self._match_valid_url(url).group('mid', 'sid')
playlist_id = f'{mid}_{sid}'
def fetch_page(page_idx):
return self._download_json(
'https://api.bilibili.com/x/polymer/space/seasons_archives_list',
playlist_id, note=f'Downloading page {page_idx}',
query={'mid': mid, 'season_id': sid, 'page_num': page_idx + 1, 'page_size': 30})['data']
def get_metadata(page_data):
page_size = page_data['page']['page_size']
entry_count = page_data['page']['total']
return {
'page_count': math.ceil(entry_count / page_size),
'page_size': page_size,
'title': traverse_obj(page_data, ('meta', 'name'))
}
def get_entries(page_data):
for entry in page_data.get('archives', []):
yield self.url_result(f'https://www.bilibili.com/video/{entry["bvid"]}',
BiliBiliIE, entry['bvid'])
metadata, paged_list = self._extract_playlist(fetch_page, get_metadata, get_entries)
return self.playlist_result(paged_list, playlist_id, metadata['title'])
class BilibiliCategoryIE(InfoExtractor): class BilibiliCategoryIE(InfoExtractor):
@@ -620,14 +721,15 @@ def _search_results(self, query):
 'keyword': query,
 'page': page_num,
 'context': '',
-'order': 'pubdate',
 'duration': 0,
 'tids_2': '',
 '__refresh__': 'true',
 'search_type': 'video',
 'tids': 0,
 'highlight': 1,
-})['data'].get('result') or []
+})['data'].get('result')
+if not videos:
+break
 for video in videos:
 yield self.url_result(video['arcurl'], 'BiliBili', str(video['aid']))
@ -905,7 +1007,7 @@ def _perform_login(self, username, password):
class BiliIntlIE(BiliIntlBaseIE): class BiliIntlIE(BiliIntlBaseIE):
_VALID_URL = r'https?://(?:www\.)?bili(?:bili\.tv|intl\.com)/(?:[a-z]{2}/)?(play/(?P<season_id>\d+)/(?P<ep_id>\d+)|video/(?P<aid>\d+))' _VALID_URL = r'https?://(?:www\.)?bili(?:bili\.tv|intl\.com)/(?:[a-zA-Z]{2}/)?(play/(?P<season_id>\d+)/(?P<ep_id>\d+)|video/(?P<aid>\d+))'
_TESTS = [{ _TESTS = [{
# Bstation page # Bstation page
'url': 'https://www.bilibili.tv/en/play/34613/341736', 'url': 'https://www.bilibili.tv/en/play/34613/341736',
@ -948,6 +1050,10 @@ class BiliIntlIE(BiliIntlBaseIE):
# No language in URL # No language in URL
'url': 'https://www.bilibili.tv/video/2019955076', 'url': 'https://www.bilibili.tv/video/2019955076',
'only_matching': True, 'only_matching': True,
}, {
# Uppercase language in URL
'url': 'https://www.bilibili.tv/EN/video/2019955076',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@ -971,7 +1077,7 @@ def _real_extract(self, url):
class BiliIntlSeriesIE(BiliIntlBaseIE): class BiliIntlSeriesIE(BiliIntlBaseIE):
_VALID_URL = r'https?://(?:www\.)?bili(?:bili\.tv|intl\.com)/(?:[a-z]{2}/)?play/(?P<id>\d+)$' _VALID_URL = r'https?://(?:www\.)?bili(?:bili\.tv|intl\.com)/(?:[a-zA-Z]{2}/)?play/(?P<id>\d+)/?(?:[?#]|$)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.bilibili.tv/en/play/34613', 'url': 'https://www.bilibili.tv/en/play/34613',
'playlist_mincount': 15, 'playlist_mincount': 15,
@ -989,6 +1095,9 @@ class BiliIntlSeriesIE(BiliIntlBaseIE):
}, { }, {
'url': 'https://www.biliintl.com/en/play/34613', 'url': 'https://www.biliintl.com/en/play/34613',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.biliintl.com/EN/play/34613',
'only_matching': True,
}] }]
def _entries(self, series_id): def _entries(self, series_id):
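Reviewer note: the new `BilibiliSpaceBaseIE._extract_playlist` fetches page 0 once, derives `page_count` as `ceil(entry_count / page_size)`, and reuses the already fetched first page instead of requesting it again. The sketch below shows that flow with a plain generator. It is a simplification under stated assumptions: it does not use `InAdvancePagedList`, and `page_count` and `iterate_pages` are hypothetical helper names.

```python
import math


def page_count(entry_count, page_size):
    """Number of pages needed to hold entry_count items."""
    return math.ceil(entry_count / page_size)


def iterate_pages(fetch_page, get_metadata, get_entries):
    """Yield entries from every page, requesting each page exactly once."""
    first_page = fetch_page(0)
    metadata = get_metadata(first_page)
    for idx in range(metadata['page_count']):
        # Page 0 was already downloaded to read the metadata; reuse it.
        page = first_page if idx == 0 else fetch_page(idx)
        yield from get_entries(page)
```

Passing the three callbacks in separately is what lets the video, audio and collection extractors share one helper even though their APIs name the paging fields differently.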


@@ -65,10 +65,12 @@ def _real_extract(self, url):
 error = self._html_search_regex(r'<h1 class="page-title">([^<]+)</h1>', webpage, 'error', default='Cannot find video')
 if error == 'Video Unavailable':
 raise GeoRestrictedError(error)
-raise ExtractorError(error)
+raise ExtractorError(error, expected=True)
 formats = entries[0]['formats']
 self._check_formats(formats, video_id)
+if not formats:
+raise self.raise_no_formats('Video is unavailable', expected=True, video_id=video_id)
 self._sort_formats(formats)
 description = self._html_search_regex(


@ -8,13 +8,28 @@
class BongaCamsIE(InfoExtractor): class BongaCamsIE(InfoExtractor):
_VALID_URL = r'https?://(?P<host>(?:[^/]+\.)?bongacams\d*\.com)/(?P<id>[^/?&#]+)' _VALID_URL = r'https?://(?P<host>(?:[^/]+\.)?bongacams\d*\.(?:com|net))/(?P<id>[^/?&#]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://de.bongacams.com/azumi-8', 'url': 'https://de.bongacams.com/azumi-8',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'https://cn.bongacams.com/azumi-8', 'url': 'https://cn.bongacams.com/azumi-8',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://de.bongacams.net/claireashton',
'info_dict': {
'id': 'claireashton',
'ext': 'mp4',
'title': r're:ClaireAshton \d{4}-\d{2}-\d{2} \d{2}:\d{2}',
'age_limit': 18,
'uploader_id': 'ClaireAshton',
'uploader': 'ClaireAshton',
'like_count': int,
'is_live': True,
},
'params': {
'skip_download': True,
},
}] }]
def _real_extract(self, url): def _real_extract(self, url):


@ -0,0 +1,87 @@
from .common import InfoExtractor
from ..utils import int_or_none, str_or_none, traverse_obj
class BooyahBaseIE(InfoExtractor):
_BOOYAH_SESSION_KEY = None
def _real_initialize(self):
BooyahBaseIE._BOOYAH_SESSION_KEY = self._request_webpage(
'https://booyah.live/api/v3/auths/sessions', None, data=b'').getheader('booyah-session-key')
def _get_comments(self, video_id):
comment_json = self._download_json(
f'https://booyah.live/api/v3/playbacks/{video_id}/comments/tops', video_id,
headers={'Booyah-Session-Key': self._BOOYAH_SESSION_KEY}, fatal=False) or {}
return [{
'id': comment.get('comment_id'),
'author': comment.get('from_nickname'),
'author_id': comment.get('from_uid'),
'author_thumbnail': comment.get('from_thumbnail'),
'text': comment.get('content'),
'timestamp': comment.get('create_time'),
'like_count': comment.get('like_cnt'),
} for comment in comment_json.get('comment_list') or ()]
class BooyahClipsIE(BooyahBaseIE):
_VALID_URL = r'https?://booyah.live/clips/(?P<id>\d+)'
_TESTS = [{
'url': 'https://booyah.live/clips/13887261322952306617',
'info_dict': {
'id': '13887261322952306617',
'ext': 'mp4',
'view_count': int,
'duration': 30,
'channel_id': 90565760,
'like_count': int,
'title': 'Cayendo con estilo 😎',
'uploader': '♡LɪMER',
'comment_count': int,
'uploader_id': '90565760',
'thumbnail': 'https://resmambet-a.akamaihd.net/mambet-storage/Clip/90565760/90565760-27204374-fba0-409d-9d7b-63a48b5c0e75.jpg',
'upload_date': '20220617',
'timestamp': 1655490556,
'modified_timestamp': 1655490556,
'modified_date': '20220617',
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
json_data = self._download_json(
f'https://booyah.live/api/v3/playbacks/{video_id}', video_id,
headers={'Booyah-Session-key': self._BOOYAH_SESSION_KEY})
formats = []
for video_data in json_data['playback']['endpoint_list']:
formats.extend(({
'url': video_data.get('stream_url'),
'ext': 'mp4',
'height': video_data.get('resolution'),
}, {
'url': video_data.get('download_url'),
'ext': 'mp4',
'format_note': 'Watermarked',
'height': video_data.get('resolution'),
'preference': -10,
}))
self._sort_formats(formats)
return {
'id': video_id,
'title': traverse_obj(json_data, ('playback', 'name')),
'thumbnail': traverse_obj(json_data, ('playback', 'thumbnail_url')),
'formats': formats,
'view_count': traverse_obj(json_data, ('playback', 'views')),
'like_count': traverse_obj(json_data, ('playback', 'likes')),
'duration': traverse_obj(json_data, ('playback', 'duration')),
'comment_count': traverse_obj(json_data, ('playback', 'comment_cnt')),
'channel_id': traverse_obj(json_data, ('playback', 'channel_id')),
'uploader': traverse_obj(json_data, ('user', 'nickname')),
'uploader_id': str_or_none(traverse_obj(json_data, ('user', 'uid'))),
'modified_timestamp': int_or_none(traverse_obj(json_data, ('playback', 'update_time_ms')), 1000),
'timestamp': int_or_none(traverse_obj(json_data, ('playback', 'create_time_ms')), 1000),
'__post_extractor': self.extract_comments(video_id, self._get_comments(video_id)),
}
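Reviewer note: the Booyah extractor converts millisecond fields (`create_time_ms`, `update_time_ms`) to second-resolution timestamps by passing a scale of 1000 to `int_or_none`, and the test expects `upload_date` `20220617` for timestamp `1655490556`. The sketch below reproduces that conversion with the standard library; `ms_to_seconds` and `upload_date` are hypothetical stand-ins, not the yt-dlp helpers.

```python
from datetime import datetime, timezone


def ms_to_seconds(value):
    """Convert a millisecond timestamp to whole seconds, or None if unusable."""
    try:
        return int(value) // 1000
    except (TypeError, ValueError):
        return None


def upload_date(timestamp):
    """Format a Unix timestamp (seconds) as YYYYMMDD in UTC."""
    if timestamp is None:
        return None
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime('%Y%m%d')


seconds = ms_to_seconds(1655490556000)
date = upload_date(seconds)
```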


@ -0,0 +1,34 @@
from .common import InfoExtractor
from .jwplatform import JWPlatformIE
class BundesligaIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bundesliga\.com/[a-z]{2}/bundesliga/videos(?:/[^?]+)?\?vid=(?P<id>[a-zA-Z0-9]{8})'
_TESTS = [
{
'url': 'https://www.bundesliga.com/en/bundesliga/videos?vid=bhhHkKyN',
'md5': '8fc3b25cd12440e3a8cdc51f1493849c',
'info_dict': {
'id': 'bhhHkKyN',
'ext': 'mp4',
'title': 'Watch: Alphonso Davies and Jeremie Frimpong head-to-head',
'thumbnail': 'https://cdn.jwplayer.com/v2/media/bhhHkKyN/poster.jpg?width=720',
'upload_date': '20220928',
'duration': 146,
'timestamp': 1664366511,
'description': 'md5:803d4411bd134140c774021dd4b7598b'
}
},
{
'url': 'https://www.bundesliga.com/en/bundesliga/videos/latest-features/T8IKc8TX?vid=ROHjs06G',
'only_matching': True
},
{
'url': 'https://www.bundesliga.com/en/bundesliga/videos/goals?vid=mOG56vWA',
'only_matching': True
}
]
def _real_extract(self, url):
video_id = self._match_id(url)
return self.url_result(f'jwplatform:{video_id}', JWPlatformIE, video_id)


@ -1,4 +1,8 @@
import base64
import codecs
import datetime
import hashlib
import hmac
import json
import re
@ -12,6 +16,8 @@
multipart_encode,
parse_duration,
random_birthday,
traverse_obj,
try_call,
try_get,
urljoin,
)
@ -19,7 +25,18 @@
class CDAIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www\.)?cda\.pl/video|ebd\.cda\.pl/[0-9]+x[0-9]+)/(?P<id>[0-9a-z]+)'
_NETRC_MACHINE = 'cdapl'
_BASE_URL = 'http://www.cda.pl/'
_BASE_API_URL = 'https://api.cda.pl'
_API_HEADERS = {
'Accept': 'application/vnd.cda.public+json',
'User-Agent': 'pl.cda 1.0 (version 1.2.88 build 15306; Android 9; Xiaomi Redmi 3S)',
}
# hardcoded in the app
_LOGIN_REQUEST_AUTH = 'Basic YzU3YzBlZDUtYTIzOC00MWQwLWI2NjQtNmZmMWMxY2Y2YzVlOklBTm95QlhRRVR6U09MV1hnV3MwMW0xT2VyNWJNZzV4clRNTXhpNGZJUGVGZ0lWUlo5UGVYTDhtUGZaR1U1U3Q'
_BEARER_CACHE = 'cda-bearer'
_TESTS = [{
'url': 'http://www.cda.pl/video/5749950c',
'md5': '6f844bf51b15f31fae165365707ae970',
@ -83,8 +100,73 @@ def _download_age_confirm_page(self, url, video_id, *args, **kwargs):
'Content-Type': content_type,
}, **kwargs)
def _perform_login(self, username, password):
cached_bearer = self.cache.load(self._BEARER_CACHE, username) or {}
if cached_bearer.get('valid_until', 0) > datetime.datetime.now().timestamp() + 5:
self._API_HEADERS['Authorization'] = f'Bearer {cached_bearer["token"]}'
return
password_hash = base64.urlsafe_b64encode(hmac.new(
b's01m1Oer5IANoyBXQETzSOLWXgWs01m1Oer5bMg5xrTMMxRZ9Pi4fIPeFgIVRZ9PeXL8mPfXQETZGUAN5StRZ9P',
''.join(f'{bytes((bt & 255, )).hex():0>2}'
for bt in hashlib.md5(password.encode()).digest()).encode(),
hashlib.sha256).digest()).decode().replace('=', '')
token_res = self._download_json(
f'{self._BASE_API_URL}/oauth/token', None, 'Logging in', data=b'',
headers={**self._API_HEADERS, 'Authorization': self._LOGIN_REQUEST_AUTH},
query={
'grant_type': 'password',
'login': username,
'password': password_hash,
})
self.cache.store(self._BEARER_CACHE, username, {
'token': token_res['access_token'],
'valid_until': token_res['expires_in'] + datetime.datetime.now().timestamp(),
})
self._API_HEADERS['Authorization'] = f'Bearer {token_res["access_token"]}'
def _real_extract(self, url):
video_id = self._match_id(url)
if 'Authorization' in self._API_HEADERS:
return self._api_extract(video_id)
else:
return self._web_extract(video_id, url)
def _api_extract(self, video_id):
meta = self._download_json(
f'{self._BASE_API_URL}/video/{video_id}', video_id, headers=self._API_HEADERS)['video']
if meta.get('premium') and not meta.get('premium_free'):
self.report_drm(video_id)
uploader = traverse_obj(meta, 'author', 'login')
formats = [{
'url': quality['file'],
'format': quality.get('title'),
'resolution': quality.get('name'),
'height': try_call(lambda: int(quality['name'][:-1])),
'filesize': quality.get('length'),
} for quality in meta['qualities'] if quality.get('file')]
self._sort_formats(formats)
return {
'id': video_id,
'title': meta.get('title'),
'description': meta.get('description'),
'uploader': None if uploader == 'anonim' else uploader,
'average_rating': float_or_none(meta.get('rating')),
'thumbnail': meta.get('thumb'),
'formats': formats,
'duration': meta.get('duration'),
'age_limit': 18 if meta.get('for_adults') else 0,
'view_count': meta.get('views'),
}
def _web_extract(self, video_id, url):
self._set_cookie('cda.pl', 'cda.player', 'html5')
webpage = self._download_webpage(
self._BASE_URL + '/video/' + video_id, video_id)
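
The password hashing added in CDAIE._perform_login can be reproduced in isolation roughly as follows. This is a minimal sketch, not the extractor's code: hash_password and DEMO_KEY are hypothetical names, and the key is a placeholder rather than the secret hardcoded in the diff. It assumes, as the diff suggests, that the API wants the unpadded URL-safe Base64 of an HMAC-SHA256 computed over the hex MD5 digest of the password.

```python
import base64
import hashlib
import hmac

# Placeholder key for illustration only (hypothetical, not the app's secret).
DEMO_KEY = b'not-the-real-key'


def hash_password(password, key=DEMO_KEY):
    """Hex MD5 of the password, HMAC-SHA256 with the key, unpadded URL-safe Base64."""
    # hexdigest() yields the same string as the per-byte join used in the diff
    md5_hex = hashlib.md5(password.encode()).hexdigest()
    mac = hmac.new(key, md5_hex.encode(), hashlib.sha256).digest()
    # A 32-byte digest encodes to 44 characters including one '=' of padding
    return base64.urlsafe_b64encode(mac).decode().replace('=', '')
```

The per-byte formatting in the diff and hashlib's hexdigest() should produce identical strings, so the shorter form is used here.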


@ -1,6 +1,6 @@
from .common import InfoExtractor
from .turner import TurnerBaseIE
from ..utils import url_basename from ..utils import merge_dicts, try_call, url_basename
class CNNIE(TurnerBaseIE):
@ -141,3 +141,58 @@ def _real_extract(self, url):
webpage = self._download_webpage(url, url_basename(url))
cnn_url = self._html_search_regex(r"video:\s*'([^']+)'", webpage, 'cnn url')
return self.url_result('http://cnn.com/video/?/video/' + cnn_url, CNNIE.ie_key())
class CNNIndonesiaIE(InfoExtractor):
_VALID_URL = r'https?://www\.cnnindonesia\.com/[\w-]+/(?P<upload_date>\d{8})\d+-\d+-(?P<id>\d+)/(?P<display_id>[\w-]+)'
_TESTS = [{
'url': 'https://www.cnnindonesia.com/ekonomi/20220909212635-89-845885/alasan-harga-bbm-di-indonesia-masih-disubsidi',
'info_dict': {
'id': '845885',
'ext': 'mp4',
'description': 'md5:e7954bfa6f1749bc9ef0c079a719c347',
'upload_date': '20220909',
'title': 'Alasan Harga BBM di Indonesia Masih Disubsidi',
'timestamp': 1662859088,
'duration': 120.0,
'thumbnail': r're:https://akcdn\.detik\.net\.id/visual/2022/09/09/thumbnail-ekopedia-alasan-harga-bbm-disubsidi_169\.jpeg',
'tags': ['ekopedia', 'subsidi bbm', 'subsidi', 'bbm', 'bbm subsidi', 'harga pertalite naik'],
'age_limit': 0,
'release_timestamp': 1662859088,
'release_date': '20220911',
'uploader': 'Asfahan Yahsyi',
}
}, {
'url': 'https://www.cnnindonesia.com/internasional/20220911104341-139-846189/video-momen-charles-disambut-meriah-usai-dilantik-jadi-raja-inggris',
'info_dict': {
'id': '846189',
'ext': 'mp4',
'upload_date': '20220911',
'duration': 76.0,
'timestamp': 1662869995,
'description': 'md5:ece7b003b3ee7d81c6a5cfede7d5397d',
'thumbnail': r're:https://akcdn\.detik\.net\.id/visual/2022/09/11/thumbnail-video-1_169\.jpeg',
'title': 'VIDEO: Momen Charles Disambut Meriah usai Dilantik jadi Raja Inggris',
'tags': ['raja charles', 'raja charles iii', 'ratu elizabeth', 'ratu elizabeth meninggal dunia', 'raja inggris', 'inggris'],
'age_limit': 0,
'release_date': '20220911',
'uploader': 'REUTERS',
'release_timestamp': 1662869995,
}
}]
def _real_extract(self, url):
upload_date, video_id, display_id = self._match_valid_url(url).group('upload_date', 'id', 'display_id')
webpage = self._download_webpage(url, display_id)
json_ld_list = list(self._yield_json_ld(webpage, display_id))
json_ld_data = self._json_ld(json_ld_list, display_id)
embed_url = next(
json_ld.get('embedUrl') for json_ld in json_ld_list if json_ld.get('@type') == 'VideoObject')
return merge_dicts(json_ld_data, {
'_type': 'url_transparent',
'url': embed_url,
'upload_date': upload_date,
'tags': try_call(lambda: self._html_search_meta('keywords', webpage).split(', '))
})


@ -5,6 +5,7 @@
import http.client
import http.cookiejar
import http.cookies
import inspect
import itertools
import json
import math
@ -21,6 +22,7 @@
from ..compat import functools # isort: split
from ..compat import compat_etree_fromstring, compat_expanduser, compat_os_name
from ..cookies import LenientSimpleCookie
from ..downloader import FileDownloader
from ..downloader.f4m import get_base_url, remove_encrypted_media
from ..utils import (
@ -64,6 +66,7 @@
sanitize_filename,
sanitize_url,
sanitized_Request,
smuggle_url,
str_or_none,
str_to_int,
strip_or_none,
@ -154,6 +157,7 @@ class InfoExtractor:
* abr Average audio bitrate in KBit/s
* acodec Name of the audio codec in use
* asr Audio sampling rate in Hertz
* audio_channels Number of audio channels
* vbr Average video bitrate in KBit/s
* fps Frame rate
* vcodec Name of the video codec in use
@ -281,6 +285,7 @@ class InfoExtractor:
captions instead of normal subtitles
duration: Length of the video in seconds, as an integer or float.
view_count: How many users have watched the video on the platform.
concurrent_view_count: How many users are currently watching the video on the platform.
like_count: Number of positive ratings of the video
dislike_count: Number of negative ratings of the video
repost_count: Number of reposts of the video
@ -330,7 +335,7 @@ class InfoExtractor:
playable_in_embed: Whether this video is allowed to play in embedded
players on other sites. Can be True (=always allowed),
False (=never allowed), None (=unknown), or a string
specifying the criteria for embedability (Eg: 'whitelist') specifying the criteria for embedability; e.g. 'whitelist'
availability: Under what condition the video is available. One of
'private', 'premium_only', 'subscriber_only', 'needs_auth',
'unlisted' or 'public'. Use 'InfoExtractor._availability'
@ -451,8 +456,8 @@ class InfoExtractor:
_extract_from_webpage may raise self.StopExtraction() to stop further
processing of the webpage and obtain exclusive rights to it. This is useful
when the extractor cannot reliably be matched using just the URL. when the extractor cannot reliably be matched using just the URL,
Eg: invidious/peertube instances e.g. invidious/peertube instances
Embed-only extractors can be defined by setting _VALID_URL = False.
@ -479,6 +484,9 @@ class InfoExtractor:
will be used by geo restriction bypass mechanism similarly
to _GEO_COUNTRIES.
The _ENABLED attribute should be set to False for IEs that
are disabled by default and must be explicitly enabled.
The _WORKING attribute should be set to False for broken IEs
in order to warn the users and skip the tests.
"""
@ -490,6 +498,7 @@ class InfoExtractor:
_GEO_COUNTRIES = None
_GEO_IP_BLOCKS = None
_WORKING = True
_ENABLED = True
_NETRC_MACHINE = None
IE_DESC = None
SEARCH_KEY = None
@ -504,7 +513,7 @@ def _login_hint(self, method=NO_DEFAULT, netrc=None):
'password': f'Use {password_hint}',
'cookies': (
'Use --cookies-from-browser or --cookies for the authentication. '
'See https://github.com/ytdl-org/youtube-dl#how-do-i-pass-cookies-to-youtube-dl for how to manually pass cookies'), 'See https://github.com/yt-dlp/yt-dlp/wiki/FAQ#how-do-i-pass-cookies-to-yt-dlp for how to manually pass cookies'),
}[method if method is not NO_DEFAULT else 'any' if self.supports_login() else 'cookies']
def __init__(self, downloader=None):
@ -1099,7 +1108,9 @@ def get_param(self, name, default=None, *args, **kwargs):
return self._downloader.params.get(name, default, *args, **kwargs)
return default
def report_drm(self, video_id, partial=False): def report_drm(self, video_id, partial=NO_DEFAULT):
if partial is not NO_DEFAULT:
self._downloader.deprecation_warning('InfoExtractor.report_drm no longer accepts the argument partial')
self.raise_no_formats('This video is DRM protected', expected=True, video_id=video_id)
def report_extraction(self, id_or_name):
@ -1220,7 +1231,7 @@ def _search_regex(self, pattern, string, name, default=NO_DEFAULT, fatal=True, f
return None
def _search_json(self, start_pattern, string, name, video_id, *, end_pattern='',
contains_pattern='(?s:.+)', fatal=True, default=NO_DEFAULT, **kwargs): contains_pattern=r'{(?s:.+)}', fatal=True, default=NO_DEFAULT, **kwargs):
"""Searches string for the JSON object specified by start_pattern"""
# NB: end_pattern is only used to reduce the size of the initial match
if default is NO_DEFAULT:
@ -1229,7 +1240,7 @@ def _search_json(self, start_pattern, string, name, video_id, *, end_pattern='',
fatal, has_default = False, True
json_string = self._search_regex(
rf'{start_pattern}\s*(?P<json>{{\s*{contains_pattern}\s*}})\s*{end_pattern}', rf'(?:{start_pattern})\s*(?P<json>{contains_pattern})\s*(?:{end_pattern})',
string, name, group='json', fatal=fatal, default=None if has_default else NO_DEFAULT)
if not json_string:
return default
@ -1459,10 +1470,6 @@ def _json_ld(self, json_ld, video_id, fatal=True, expected_type=None):
if not json_ld:
return {}
info = {}
if not isinstance(json_ld, (list, tuple, dict)):
return info
if isinstance(json_ld, dict):
json_ld = [json_ld]
INTERACTION_TYPE_MAP = {
'CommentAction': 'comment',
@ -1529,10 +1536,10 @@ def extract_chapter_information(e):
info['chapters'] = chapters
def extract_video_object(e):
assert is_type(e, 'VideoObject')
author = e.get('author')
info.update({
'url': url_or_none(e.get('contentUrl')),
'ext': mimetype2ext(e.get('encodingFormat')),
'title': unescapeHTML(e.get('name')),
'description': unescapeHTML(e.get('description')),
'thumbnails': [{'url': unescapeHTML(url)}
@ -1545,21 +1552,30 @@ def extract_video_object(e):
# however some websites are using 'Text' type instead.
# 1. https://schema.org/VideoObject
'uploader': author.get('name') if isinstance(author, dict) else author if isinstance(author, str) else None,
'artist': traverse_obj(e, ('byArtist', 'name'), expected_type=str),
'filesize': int_or_none(float_or_none(e.get('contentSize'))),
'tbr': int_or_none(e.get('bitrate')),
'width': int_or_none(e.get('width')),
'height': int_or_none(e.get('height')),
'view_count': int_or_none(e.get('interactionCount')),
'tags': try_call(lambda: e.get('keywords').split(',')),
})
if is_type(e, 'AudioObject'):
info.update({
'vcodec': 'none',
'abr': int_or_none(e.get('bitrate')),
})
extract_interaction_statistic(e)
extract_chapter_information(e)
def traverse_json_ld(json_ld, at_top_level=True):
for e in json_ld: for e in variadic(json_ld):
if not isinstance(e, dict):
continue
if at_top_level and '@context' not in e:
continue
if at_top_level and set(e.keys()) == {'@context', '@graph'}:
traverse_json_ld(variadic(e['@graph'], allowed_types=(dict,)), at_top_level=False) traverse_json_ld(e['@graph'], at_top_level=False)
break
if expected_type is not None and not is_type(e, expected_type):
continue
@ -1601,7 +1617,7 @@ def traverse_json_ld(json_ld, at_top_level=True):
extract_video_object(e['video'][0])
elif is_type(traverse_obj(e, ('subjectOf', 0)), 'VideoObject'):
extract_video_object(e['subjectOf'][0])
elif is_type(e, 'VideoObject'): elif is_type(e, 'VideoObject', 'AudioObject'):
extract_video_object(e)
if expected_type is None:
continue
@ -1614,8 +1630,8 @@ def traverse_json_ld(json_ld, at_top_level=True):
continue
else:
break
traverse_json_ld(json_ld)
traverse_json_ld(json_ld)
return filter_dict(info)
def _search_nextjs_data(self, webpage, video_id, *, transform_source=None, fatal=True, **kw):
@ -1668,8 +1684,8 @@ class FormatSort:
regex = r' *((?P<reverse>\+)?(?P<field>[a-zA-Z0-9_]+)((?P<separator>[~:])(?P<limit>.*?))?)? *$'
default = ('hidden', 'aud_or_vid', 'hasvid', 'ie_pref', 'lang', 'quality',
'res', 'fps', 'hdr:12', 'codec:vp9.2', 'size', 'br', 'asr', 'res', 'fps', 'hdr:12', 'vcodec:vp9.2', 'channels', 'acodec',
'proto', 'ext', 'hasaud', 'source', 'id') # These must not be aliases 'size', 'br', 'asr', 'proto', 'ext', 'hasaud', 'source', 'id') # These must not be aliases
ytdl_default = ('hasaud', 'lang', 'quality', 'tbr', 'filesize', 'vbr',
'height', 'width', 'proto', 'vext', 'abr', 'aext',
'fps', 'fs_approx', 'source', 'id')
@ -1688,7 +1704,7 @@ class FormatSort:
'order_free': ('webm', 'mp4', 'flv', '', 'none')},
'aext': {'type': 'ordered', 'field': 'audio_ext',
'order': ('m4a', 'aac', 'mp3', 'ogg', 'opus', 'webm', '', 'none'),
'order_free': ('opus', 'ogg', 'webm', 'm4a', 'mp3', 'aac', '', 'none')}, 'order_free': ('ogg', 'opus', 'webm', 'mp3', 'm4a', 'aac', '', 'none')},
'hidden': {'visible': False, 'forced': True, 'type': 'extractor', 'max': -1000},
'aud_or_vid': {'visible': False, 'forced': True, 'type': 'multiple',
'field': ('vcodec', 'acodec'),
@ -1704,6 +1720,7 @@ class FormatSort:
'height': {'convert': 'float_none'},
'width': {'convert': 'float_none'},
'fps': {'convert': 'float_none'},
'channels': {'convert': 'float_none', 'field': 'audio_channels'},
'tbr': {'convert': 'float_none'},
'vbr': {'convert': 'float_none'},
'abr': {'convert': 'float_none'},
@ -1717,13 +1734,14 @@ class FormatSort:
'res': {'type': 'multiple', 'field': ('height', 'width'),
'function': lambda it: (lambda l: min(l) if l else 0)(tuple(filter(None, it)))},
# For compatibility with youtube-dl # Actual field names
'format_id': {'type': 'alias', 'field': 'id'},
'preference': {'type': 'alias', 'field': 'ie_pref'},
'language_preference': {'type': 'alias', 'field': 'lang'},
'source_preference': {'type': 'alias', 'field': 'source'},
'protocol': {'type': 'alias', 'field': 'proto'},
'filesize_approx': {'type': 'alias', 'field': 'fs_approx'},
'audio_channels': {'type': 'alias', 'field': 'channels'},
# Deprecated
'dimension': {'type': 'alias', 'field': 'res', 'deprecated': True},
@ -1759,9 +1777,8 @@ def _get_field_setting(self, field, key):
if field not in self.settings:
if key in ('forced', 'priority'):
return False
self.ydl.deprecation_warning( self.ydl.deprecated_feature(f'Using arbitrary fields ({field}) for format sorting is '
f'Using arbitrary fields ({field}) for format sorting is deprecated ' 'deprecated and may be removed in a future version')
'and may be removed in a future version')
self.settings[field] = {}
propObj = self.settings[field]
if key not in propObj:
@ -1846,9 +1863,8 @@ def add_item(field, reverse, closest, limit_text):
if self._get_field_setting(field, 'type') == 'alias':
alias, field = field, self._get_field_setting(field, 'field')
if self._get_field_setting(alias, 'deprecated'):
self.ydl.deprecation_warning( self.ydl.deprecated_feature(f'Format sorting alias {alias} is deprecated and may '
f'Format sorting alias {alias} is deprecated ' f'be removed in a future version. Please use {field} instead')
f'and may be removed in a future version. Please use {field} instead')
reverse = match.group('reverse') is not None
closest = match.group('separator') == '~'
limit_text = match.group('limit')
@ -2364,7 +2380,7 @@ def build_stream_name():
audio_group_id = last_stream_inf.get('AUDIO')
# As per [1, 4.3.4.1.1] any EXT-X-STREAM-INF tag which
# references a rendition group MUST have a CODECS attribute.
# However, this is not always respected, for example, [2] # However, this is not always respected. E.g. [2]
# contains EXT-X-STREAM-INF tag which references AUDIO
# rendition group but does not have CODECS and despite
# referencing an audio group it represents a complete
@ -2909,6 +2925,8 @@ def extract_Initialization(source):
def prepare_template(template_name, identifiers):
tmpl = representation_ms_info[template_name]
if representation_id is not None:
tmpl = tmpl.replace('$RepresentationID$', representation_id)
# First of, % characters outside $...$ templates
# must be escaped by doubling for proper processing
# by % operator string formatting used further (see
@ -2923,8 +2941,6 @@ def prepare_template(template_name, identifiers):
t += c
# Next, $...$ templates are translated to their
# %(...) counterparts to be used with % operator
if representation_id is not None:
t = t.replace('$RepresentationID$', representation_id)
t = re.sub(r'\$(%s)\$' % '|'.join(identifiers), r'%(\1)d', t)
t = re.sub(r'\$(%s)%%([^$]+)\$' % '|'.join(identifiers), r'%(\1)\2', t)
t.replace('$$', '$')
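
A standalone sketch of the template translation may help show why the $RepresentationID$ substitution was moved ahead of the escaping step: presumably so that a literal % inside the representation ID gets doubled like any other % outside a $...$ template. The free function below is a hypothetical rewrite of the nested prepare_template; it takes the template string and representation ID as arguments instead of closing over them, and the sample values are invented.

```python
import re


def prepare_template(tmpl, identifiers, representation_id=None):
    """Translate a DASH $...$ URL template into a %-format string (sketch)."""
    if representation_id is not None:
        # Substitute first, so any '%' in the ID is escaped by the loop below
        tmpl = tmpl.replace('$RepresentationID$', representation_id)
    t = ''
    in_template = False
    for c in tmpl:
        t += c
        if c == '$':
            in_template = not in_template
        elif c == '%' and not in_template:
            t += c  # double '%' found outside $...$ templates
    # $Number$ becomes %(Number)d; $Number%05d$ becomes %(Number)05d
    t = re.sub(r'\$(%s)\$' % '|'.join(identifiers), r'%(\1)d', t)
    t = re.sub(r'\$(%s)%%([^$]+)\$' % '|'.join(identifiers), r'%(\1)\2', t)
    # Unlike the bare statement in the diff context, the result is returned here
    return t.replace('$$', '$')


fmt = prepare_template(
    '$RepresentationID$/seg-$Number%05d$.m4s', ('Number', 'Bandwidth', 'Time'), 'video%1')
segment_url = fmt % {'Number': 7}
```

With the old ordering, the same hypothetical ID would have been inserted after escaping, and its bare % would then have been interpreted by the % operator when the segment URL was formatted.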
@ -3000,8 +3016,8 @@ def add_segment_url():
segment_number += 1
segment_time += segment_d
elif 'segment_urls' in representation_ms_info and 's' in representation_ms_info:
# No media template # No media template,
# Example: https://www.youtube.com/watch?v=iXZV5uAYMJI # e.g. https://www.youtube.com/watch?v=iXZV5uAYMJI
# or any YouTube dashsegments video
fragments = []
segment_index = 0
@ -3018,7 +3034,7 @@ def add_segment_url():
representation_ms_info['fragments'] = fragments
elif 'segment_urls' in representation_ms_info:
# Segment URLs with no SegmentTimeline
# Example: https://www.seznam.cz/zpravy/clanek/cesko-zasahne-vitr-o-sile-vichrice-muze-byt-i-zivotu-nebezpecny-39091 # E.g. https://www.seznam.cz/zpravy/clanek/cesko-zasahne-vitr-o-sile-vichrice-muze-byt-i-zivotu-nebezpecny-39091
# https://github.com/ytdl-org/youtube-dl/pull/14844
fragments = []
segment_duration = float_or_none(
@ -3110,9 +3126,10 @@ def _parse_ism_formats_and_subtitles(self, ism_doc, ism_url, ism_id=None):
stream_name = stream.get('Name')
stream_language = stream.get('Language', 'und')
for track in stream.findall('QualityLevel'):
fourcc = track.get('FourCC') or ('AACL' if track.get('AudioTag') == '255' else None) KNOWN_TAGS = {'255': 'AACL', '65534': 'EC-3'}
fourcc = track.get('FourCC') or KNOWN_TAGS.get(track.get('AudioTag'))
# TODO: add support for WVC1 and WMAP
if fourcc not in ('H264', 'AVC1', 'AACL', 'TTML'): if fourcc not in ('H264', 'AVC1', 'AACL', 'TTML', 'EC-3'):
self.report_warning('%s is not a supported codec' % fourcc)
continue
tbr = int(track.attrib['Bitrate']) // 1000
@ -3246,8 +3263,8 @@ def _media_formats(src, cur_media_type, type_info=None):
media_tags.extend(re.findall(
# We only allow video|audio followed by a whitespace or '>'.
# Allowing more characters may end up in significant slow down (see
# https://github.com/ytdl-org/youtube-dl/issues/11979, example URL: # https://github.com/ytdl-org/youtube-dl/issues/11979,
# http://www.porntrex.com/maps/videositemap.xml). # e.g. http://www.porntrex.com/maps/videositemap.xml).
r'(?s)(<(?P<tag>%s)(?:\s+[^>]*)?>)(.*?)</(?P=tag)>' % _MEDIA_TAG_NAME_RE, webpage))
for media_tag, _, media_type, media_content in media_tags:
media_info = {
@ -3255,7 +3272,7 @@ def _media_formats(src, cur_media_type, type_info=None):
'subtitles': {},
}
media_attributes = extract_attributes(media_tag)
src = strip_or_none(media_attributes.get('src')) src = strip_or_none(dict_get(media_attributes, ('src', 'data-video-src', 'data-src', 'data-source')))
if src:
f = parse_content_type(media_attributes.get('type'))
_, formats = _media_formats(src, media_type, f)
@ -3266,7 +3283,7 @@ def _media_formats(src, cur_media_type, type_info=None):
s_attr = extract_attributes(source_tag)
# data-video-src and data-src are non standard but seen
# several times in the wild
src = strip_or_none(dict_get(s_attr, ('src', 'data-video-src', 'data-src'))) src = strip_or_none(dict_get(s_attr, ('src', 'data-video-src', 'data-src', 'data-source')))
if not src:
continue
f = parse_content_type(s_attr.get('type'))
@ -3572,7 +3589,8 @@ def _parse_jwplayer_formats(self, jwplayer_sources_data, video_id=None,
'url': source_url,
'width': int_or_none(source.get('width')),
'height': height,
'tbr': int_or_none(source.get('bitrate')), 'tbr': int_or_none(source.get('bitrate'), scale=1000),
'filesize': int_or_none(source.get('filesize')),
'ext': ext,
}
if source_url.startswith('rtmp'):
@ -3626,7 +3644,7 @@ def _set_cookie(self, domain, name, value, expire_time=None, port=None,
def _get_cookies(self, url):
""" Return a http.cookies.SimpleCookie with the cookies for the url """
return http.cookies.SimpleCookie(self._downloader._calc_cookies(url)) return LenientSimpleCookie(self._downloader._calc_cookies(url))
def _apply_first_set_cookie_header(self, url_handle, cookie):
"""
@ -3703,7 +3721,7 @@ def description(cls, *, markdown=True, search_examples=None):
desc += f'; "{cls.SEARCH_KEY}:" prefix' desc += f'; "{cls.SEARCH_KEY}:" prefix'
if search_examples: if search_examples:
_COUNTS = ('', '5', '10', 'all') _COUNTS = ('', '5', '10', 'all')
desc += f' (Example: "{cls.SEARCH_KEY}{random.choice(_COUNTS)}:{random.choice(search_examples)}")' desc += f' (e.g. "{cls.SEARCH_KEY}{random.choice(_COUNTS)}:{random.choice(search_examples)}")'
if not cls.working(): if not cls.working():
desc += ' (**Currently broken**)' if markdown else ' (Currently broken)' desc += ' (**Currently broken**)' if markdown else ' (Currently broken)'
@ -3827,8 +3845,8 @@ def _configuration_arg(self, key, default=NO_DEFAULT, *, ie_key=None, casesense=
@param default The default value to return when the key is not present (default: []) @param default The default value to return when the key is not present (default: [])
@param casesense When false, the values are converted to lower case @param casesense When false, the values are converted to lower case
''' '''
val = traverse_obj( ie_key = ie_key if isinstance(ie_key, str) else (ie_key or self).ie_key()
self._downloader.params, ('extractor_args', (ie_key or self.ie_key()).lower(), key)) val = traverse_obj(self._downloader.params, ('extractor_args', ie_key.lower(), key))
if val is None: if val is None:
return [] if default is NO_DEFAULT else default return [] if default is NO_DEFAULT else default
return list(val) if casesense else [x.lower() for x in val] return list(val) if casesense else [x.lower() for x in val]
@ -3850,12 +3868,20 @@ def _yes_playlist(self, playlist_id, video_id, smuggled_data=None, *, playlist_l
return True return True
def _error_or_warning(self, err, _count=None, _retries=0, *, fatal=True): def _error_or_warning(self, err, _count=None, _retries=0, *, fatal=True):
RetryManager.report_retry(err, _count or int(fatal), _retries, info=self.to_screen, warn=self.report_warning, RetryManager.report_retry(
err, _count or int(fatal), _retries,
info=self.to_screen, warn=self.report_warning, error=None if fatal else self.report_warning,
sleep_func=self.get_param('retry_sleep_functions', {}).get('extractor')) sleep_func=self.get_param('retry_sleep_functions', {}).get('extractor'))
def RetryManager(self, **kwargs): def RetryManager(self, **kwargs):
return RetryManager(self.get_param('extractor_retries', 3), self._error_or_warning, **kwargs) return RetryManager(self.get_param('extractor_retries', 3), self._error_or_warning, **kwargs)
def _extract_generic_embeds(self, url, *args, info_dict={}, note='Extracting generic embeds', **kwargs):
display_id = traverse_obj(info_dict, 'display_id', 'id')
self.to_screen(f'{format_field(display_id, None, "%s: ")}{note}')
return self._downloader.get_info_extractor('Generic')._extract_embeds(
smuggle_url(url, {'block_ies': [self.ie_key()]}), *args, **kwargs)
@classmethod @classmethod
def extract_from_webpage(cls, ydl, url, webpage): def extract_from_webpage(cls, ydl, url, webpage):
ie = (cls if isinstance(cls._extract_from_webpage, types.MethodType) ie = (cls if isinstance(cls._extract_from_webpage, types.MethodType)
@ -3869,7 +3895,7 @@ def extract_from_webpage(cls, ydl, url, webpage):
def _extract_from_webpage(cls, url, webpage): def _extract_from_webpage(cls, url, webpage):
for embed_url in orderedSet( for embed_url in orderedSet(
cls._extract_embed_urls(url, webpage) or [], lazy=True): cls._extract_embed_urls(url, webpage) or [], lazy=True):
yield cls.url_result(embed_url, cls) yield cls.url_result(embed_url, None if cls._VALID_URL is False else cls)
@classmethod @classmethod
def _extract_embed_urls(cls, url, webpage): def _extract_embed_urls(cls, url, webpage):
@ -3895,6 +3921,18 @@ def _extract_url(cls, webpage): # TODO: Remove
"""Only for compatibility with some older extractors""" """Only for compatibility with some older extractors"""
return next(iter(cls._extract_embed_urls(None, webpage) or []), None) return next(iter(cls._extract_embed_urls(None, webpage) or []), None)
@classmethod
def __init_subclass__(cls, *, plugin_name=None, **kwargs):
if plugin_name:
mro = inspect.getmro(cls)
super_class = cls.__wrapped__ = mro[mro.index(cls) + 1]
cls.IE_NAME, cls.ie_key = f'{super_class.IE_NAME}+{plugin_name}', super_class.ie_key
while getattr(super_class, '__wrapped__', None):
super_class = super_class.__wrapped__
setattr(sys.modules[super_class.__module__], super_class.__name__, cls)
return super().__init_subclass__(**kwargs)
class SearchInfoExtractor(InfoExtractor): class SearchInfoExtractor(InfoExtractor):
""" """
@ -3938,3 +3976,12 @@ def _search_results(self, query):
@classproperty @classproperty
def SEARCH_KEY(cls): def SEARCH_KEY(cls):
return cls._SEARCH_KEY return cls._SEARCH_KEY
class UnsupportedURLIE(InfoExtractor):
_VALID_URL = '.*'
_ENABLED = False
IE_DESC = False
def _real_extract(self, url):
raise UnsupportedError(url)
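Review note: the JW Player hunk above changes `tbr` to `int_or_none(source.get('bitrate'), scale=1000)` and adds a `filesize` field. This suggests the source reports bitrate in bit/s, while `tbr` is expected in kbit/s. The sketch below shows the effect of the scale. It uses a simplified stand-in, `int_or_none_sketch`, which is hypothetical and not the library's actual helper. The sample values are made up.

```python
def int_or_none_sketch(value, scale=1, default=None):
    """Simplified stand-in (an assumption, not yt-dlp's real helper):
    convert to int and floor-divide by `scale`, or return `default`."""
    try:
        return int(value) // scale
    except (TypeError, ValueError):
        return default


# Hypothetical JW Player source entry; values are invented for illustration.
source = {'bitrate': 1500000, 'filesize': '1048576'}

tbr = int_or_none_sketch(source.get('bitrate'), scale=1000)   # bit/s -> kbit/s
filesize = int_or_none_sketch(source.get('filesize'))         # numeric string -> int
missing = int_or_none_sketch(source.get('width'))             # absent key -> None
print(tbr, filesize, missing)
```

The point of routing every field through such a helper is that a missing or malformed value becomes `None` instead of raising during format construction.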

View File

@ -114,7 +114,14 @@ def _add_skip_wall(url):
class CrunchyrollIE(CrunchyrollBaseIE, VRVBaseIE): class CrunchyrollIE(CrunchyrollBaseIE, VRVBaseIE):
IE_NAME = 'crunchyroll' IE_NAME = 'crunchyroll'
_VALID_URL = r'https?://(?:(?P<prefix>www|m)\.)?(?P<url>crunchyroll\.(?:com|fr)/(?:media(?:-|/\?id=)|(?!series/|watch/)(?:[^/]+/){1,2}[^/?&]*?)(?P<id>[0-9]+))(?:[/?&]|$)' _VALID_URL = r'''(?x)
https?://(?:(?P<prefix>www|m)\.)?(?P<url>
crunchyroll\.(?:com|fr)/(?:
media(?:-|/\?id=)|
(?!series/|watch/)(?:[^/]+/){1,2}[^/?&#]*?
)(?P<id>[0-9]+)
)(?:[/?&#]|$)'''
_TESTS = [{ _TESTS = [{
'url': 'http://www.crunchyroll.com/wanna-be-the-strongest-in-the-world/episode-1-an-idol-wrestler-is-born-645513', 'url': 'http://www.crunchyroll.com/wanna-be-the-strongest-in-the-world/episode-1-an-idol-wrestler-is-born-645513',
'info_dict': { 'info_dict': {
@ -713,15 +720,20 @@ class CrunchyrollBetaBaseIE(CrunchyrollBaseIE):
def _get_params(self, lang): def _get_params(self, lang):
if not CrunchyrollBetaBaseIE.params: if not CrunchyrollBetaBaseIE.params:
if self._get_cookies(f'https://beta.crunchyroll.com/{lang}').get('etp_rt'):
grant_type, key = 'etp_rt_cookie', 'accountAuthClientId'
else:
grant_type, key = 'client_id', 'anonClientId'
initial_state, app_config = self._get_beta_embedded_json(self._download_webpage( initial_state, app_config = self._get_beta_embedded_json(self._download_webpage(
f'https://beta.crunchyroll.com/{lang}', None, note='Retrieving main page'), None) f'https://beta.crunchyroll.com/{lang}', None, note='Retrieving main page'), None)
api_domain = app_config['cxApiParams']['apiDomain'] api_domain = app_config['cxApiParams']['apiDomain']
basic_token = str(base64.b64encode(('%s:' % app_config['cxApiParams']['accountAuthClientId']).encode('ascii')), 'ascii')
auth_response = self._download_json( auth_response = self._download_json(
f'{api_domain}/auth/v1/token', None, note='Authenticating with cookie', f'{api_domain}/auth/v1/token', None, note=f'Authenticating with grant_type={grant_type}',
headers={ headers={
'Authorization': 'Basic ' + basic_token 'Authorization': 'Basic ' + str(base64.b64encode(('%s:' % app_config['cxApiParams'][key]).encode('ascii')), 'ascii')
}, data='grant_type=etp_rt_cookie'.encode('ascii')) }, data=f'grant_type={grant_type}'.encode('ascii'))
policy_response = self._download_json( policy_response = self._download_json(
f'{api_domain}/index/v2', None, note='Retrieving signed policy', f'{api_domain}/index/v2', None, note='Retrieving signed policy',
headers={ headers={
@ -740,25 +752,14 @@ def _get_params(self, lang):
CrunchyrollBetaBaseIE.params = (api_domain, bucket, params) CrunchyrollBetaBaseIE.params = (api_domain, bucket, params)
return CrunchyrollBetaBaseIE.params return CrunchyrollBetaBaseIE.params
def _redirect_from_beta(self, url, lang, internal_id, display_id, is_episode, iekey):
initial_state, app_config = self._get_beta_embedded_json(self._download_webpage(url, display_id), display_id)
content_data = initial_state['content']['byId'][internal_id]
if is_episode:
video_id = content_data['external_id'].split('.')[1]
series_id = content_data['episode_metadata']['series_slug_title']
else:
series_id = content_data['slug_title']
series_id = re.sub(r'-{2,}', '-', series_id)
url = f'https://www.crunchyroll.com/{lang}{series_id}'
if is_episode:
url = url + f'/{display_id}-{video_id}'
self.to_screen(f'{display_id}: Not logged in. Redirecting to non-beta site - {url}')
return self.url_result(url, iekey, display_id)
class CrunchyrollBetaIE(CrunchyrollBetaBaseIE): class CrunchyrollBetaIE(CrunchyrollBetaBaseIE):
IE_NAME = 'crunchyroll:beta' IE_NAME = 'crunchyroll:beta'
_VALID_URL = r'https?://beta\.crunchyroll\.com/(?P<lang>(?:\w{2}(?:-\w{2})?/)?)watch/(?P<id>\w+)/(?P<display_id>[\w\-]*)/?(?:\?|$)' _VALID_URL = r'''(?x)
https?://beta\.crunchyroll\.com/
(?P<lang>(?:\w{2}(?:-\w{2})?/)?)
watch/(?P<id>\w+)
(?:/(?P<display_id>[\w-]+))?/?(?:[?#]|$)'''
_TESTS = [{ _TESTS = [{
'url': 'https://beta.crunchyroll.com/watch/GY2P1Q98Y/to-the-future', 'url': 'https://beta.crunchyroll.com/watch/GY2P1Q98Y/to-the-future',
'info_dict': { 'info_dict': {
@ -778,9 +779,30 @@ class CrunchyrollBetaIE(CrunchyrollBetaBaseIE):
'episode_number': 73, 'episode_number': 73,
'thumbnail': r're:^https://beta.crunchyroll.com/imgsrv/.*\.jpeg$', 'thumbnail': r're:^https://beta.crunchyroll.com/imgsrv/.*\.jpeg$',
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8', 'format': 'all[format_id~=hardsub]'},
}, { }, {
'url': 'https://beta.crunchyroll.com/watch/GY2P1Q98Y/', 'url': 'https://beta.crunchyroll.com/watch/GYE5WKQGR',
'info_dict': {
'id': 'GYE5WKQGR',
'ext': 'mp4',
'duration': 366.459,
'timestamp': 1476788400,
'description': 'md5:74b67283ffddd75f6e224ca7dc031e76',
'title': 'SHELTER Episode Porter Robinson presents Shelter the Animation',
'upload_date': '20161018',
'series': 'SHELTER',
'series_id': 'GYGG09WWY',
'season': 'SHELTER',
'season_id': 'GR09MGK4R',
'season_number': 1,
'episode': 'Porter Robinson presents Shelter the Animation',
'episode_number': 0,
'thumbnail': r're:^https://beta.crunchyroll.com/imgsrv/.*\.jpeg$',
},
'params': {'skip_download': True},
'skip': 'Video is Premium only',
}, {
'url': 'https://beta.crunchyroll.com/watch/GY2P1Q98Y',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'https://beta.crunchyroll.com/pt-br/watch/G8WUN8VKP/the-ruler-of-conspiracy', 'url': 'https://beta.crunchyroll.com/pt-br/watch/G8WUN8VKP/the-ruler-of-conspiracy',
@ -789,10 +811,6 @@ class CrunchyrollBetaIE(CrunchyrollBetaBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id') lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id')
if not self._get_cookies(url).get('etp_rt'):
return self._redirect_from_beta(url, lang, internal_id, display_id, True, CrunchyrollIE.ie_key())
api_domain, bucket, params = self._get_params(lang) api_domain, bucket, params = self._get_params(lang)
episode_response = self._download_json( episode_response = self._download_json(
@ -810,25 +828,43 @@ def _real_extract(self, url):
hardsub_preference = qualities(requested_hardsubs[::-1]) hardsub_preference = qualities(requested_hardsubs[::-1])
requested_formats = self._configuration_arg('format') or ['adaptive_hls'] requested_formats = self._configuration_arg('format') or ['adaptive_hls']
formats = [] available_formats = {}
for stream_type, streams in get_streams('streams'): for stream_type, streams in get_streams('streams'):
if stream_type not in requested_formats: if stream_type not in requested_formats:
continue continue
for stream in streams.values(): for stream in streams.values():
hardsub_lang = stream.get('hardsub_locale') or ''
if hardsub_lang.lower() not in requested_hardsubs:
continue
format_id = join_nonempty(stream_type, format_field(stream, 'hardsub_locale', 'hardsub-%s'))
if not stream.get('url'): if not stream.get('url'):
continue continue
hardsub_lang = stream.get('hardsub_locale') or ''
format_id = join_nonempty(stream_type, format_field(stream, 'hardsub_locale', 'hardsub-%s'))
available_formats[hardsub_lang] = (stream_type, format_id, hardsub_lang, stream['url'])
if '' in available_formats and 'all' not in requested_hardsubs:
full_format_langs = set(requested_hardsubs)
self.to_screen(
'To get all formats of a hardsub language, use '
'"--extractor-args crunchyrollbeta:hardsub=<language_code or all>". '
'See https://github.com/yt-dlp/yt-dlp#crunchyrollbeta for more info',
only_once=True)
else:
full_format_langs = set(map(str.lower, available_formats))
formats = []
for stream_type, format_id, hardsub_lang, stream_url in available_formats.values():
if stream_type.endswith('hls'): if stream_type.endswith('hls'):
if hardsub_lang.lower() in full_format_langs:
adaptive_formats = self._extract_m3u8_formats( adaptive_formats = self._extract_m3u8_formats(
stream['url'], display_id, 'mp4', m3u8_id=format_id, stream_url, display_id, 'mp4', m3u8_id=format_id,
fatal=False, note=f'Downloading {format_id} HLS manifest') fatal=False, note=f'Downloading {format_id} HLS manifest')
else:
adaptive_formats = (self._m3u8_meta_format(stream_url, ext='mp4', m3u8_id=format_id),)
elif stream_type.endswith('dash'): elif stream_type.endswith('dash'):
adaptive_formats = self._extract_mpd_formats( adaptive_formats = self._extract_mpd_formats(
stream['url'], display_id, mpd_id=format_id, stream_url, display_id, mpd_id=format_id,
fatal=False, note=f'Downloading {format_id} MPD manifest') fatal=False, note=f'Downloading {format_id} MPD manifest')
else:
self.report_warning(f'Encountered unknown stream_type: {stream_type!r}', display_id, only_once=True)
continue
for f in adaptive_formats: for f in adaptive_formats:
if f.get('acodec') != 'none': if f.get('acodec') != 'none':
f['language'] = stream_response.get('audio_locale') f['language'] = stream_response.get('audio_locale')
@ -867,7 +903,11 @@ def _real_extract(self, url):
class CrunchyrollBetaShowIE(CrunchyrollBetaBaseIE): class CrunchyrollBetaShowIE(CrunchyrollBetaBaseIE):
IE_NAME = 'crunchyroll:playlist:beta' IE_NAME = 'crunchyroll:playlist:beta'
_VALID_URL = r'https?://beta\.crunchyroll\.com/(?P<lang>(?:\w{2}(?:-\w{2})?/)?)series/(?P<id>\w+)/(?P<display_id>[\w\-]*)/?(?:\?|$)' _VALID_URL = r'''(?x)
https?://beta\.crunchyroll\.com/
(?P<lang>(?:\w{2}(?:-\w{2})?/)?)
series/(?P<id>\w+)
(?:/(?P<display_id>[\w-]+))?/?(?:[?#]|$)'''
_TESTS = [{ _TESTS = [{
'url': 'https://beta.crunchyroll.com/series/GY19NQ2QR/Girl-Friend-BETA', 'url': 'https://beta.crunchyroll.com/series/GY19NQ2QR/Girl-Friend-BETA',
'info_dict': { 'info_dict': {
@ -876,16 +916,12 @@ class CrunchyrollBetaShowIE(CrunchyrollBetaBaseIE):
}, },
'playlist_mincount': 10, 'playlist_mincount': 10,
}, { }, {
'url': 'https://beta.crunchyroll.com/it/series/GY19NQ2QR/Girl-Friend-BETA', 'url': 'https://beta.crunchyroll.com/it/series/GY19NQ2QR',
'only_matching': True, 'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id') lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id')
if not self._get_cookies(url).get('etp_rt'):
return self._redirect_from_beta(url, lang, internal_id, display_id, False, CrunchyrollShowPlaylistIE.ie_key())
api_domain, bucket, params = self._get_params(lang) api_domain, bucket, params = self._get_params(lang)
series_response = self._download_json( series_response = self._download_json(
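Review note: the `_get_params` hunk above replaces the hard-coded cookie grant with a choice between `etp_rt_cookie` (when the `etp_rt` cookie is present) and an anonymous `client_id` grant. This is what lets the `_redirect_from_beta` fallback be removed. The following is a minimal sketch of that selection and of the Basic authorization header it builds. `build_token_request` and the placeholder client IDs are hypothetical, and no network request is made.

```python
import base64


def build_token_request(cx_api_params, has_etp_rt_cookie):
    """Return (headers, body) for the token request, mirroring the hunk above.

    `cx_api_params` stands in for app_config['cxApiParams']; the function
    name and signature are assumptions made for illustration only.
    """
    if has_etp_rt_cookie:
        grant_type, key = 'etp_rt_cookie', 'accountAuthClientId'
    else:
        grant_type, key = 'client_id', 'anonClientId'
    # HTTP Basic credentials: client id as the user, empty password
    basic_token = base64.b64encode(f'{cx_api_params[key]}:'.encode('ascii')).decode('ascii')
    headers = {'Authorization': 'Basic ' + basic_token}
    body = f'grant_type={grant_type}'.encode('ascii')
    return headers, body


# Placeholder client ids, not real values
params = {'accountAuthClientId': 'account-id', 'anonClientId': 'anon-id'}
anon_headers, anon_body = build_token_request(params, has_etp_rt_cookie=False)
user_headers, user_body = build_token_request(params, has_etp_rt_cookie=True)
print(anon_body, user_body)
```

Keeping the grant type and the config key paired in one tuple assignment, as the diff does, avoids the two drifting apart when a third grant type is added.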

View File

@ -1,11 +1,10 @@
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
smuggle_url, smuggle_url,
str_or_none, str_or_none,
traverse_obj, traverse_obj,
urlencode_postdata urlencode_postdata,
) )

View File

@ -1,122 +1,160 @@
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import merge_dicts, str_or_none from ..utils import int_or_none, merge_dicts, try_call, url_basename
class Detik20IE(InfoExtractor): class DetikEmbedIE(InfoExtractor):
IE_NAME = '20.detik.com' _VALID_URL = False
_VALID_URL = r'https?://20\.detik\.com/((?!program)[\w-]+)/[\d-]+/(?P<id>[\w-]+)' _WEBPAGE_TESTS = [{
_TESTS = [{ # cnn embed
# detikflash 'url': 'https://www.cnnindonesia.com/embed/video/846189',
'url': 'https://20.detik.com/detikflash/20220705-220705098/zulhas-klaim-sukses-turunkan-harga-migor-jawa-bali',
'info_dict': { 'info_dict': {
'id': '220705098', 'id': '846189',
'ext': 'mp4', 'ext': 'mp4',
'duration': 157, 'description': 'md5:ece7b003b3ee7d81c6a5cfede7d5397d',
'thumbnail': 'https://cdnv.detik.com/videoservice/AdminTV/2022/07/05/bfe0384db04f4bbb9dd5efc869c5d4b1-20220705164334-0s.jpg?w=650&q=80', 'thumbnail': r're:https?://akcdn\.detik\.net\.id/visual/2022/09/11/thumbnail-video-1_169.jpeg',
'description': 'md5:ac18dcee5b107abbec1ed46e0bf400e3', 'title': 'Video CNN Indonesia - VIDEO: Momen Charles Disambut Meriah usai Dilantik jadi Raja Inggris',
'title': 'Zulhas Klaim Sukses Turunkan Harga Migor Jawa-Bali', 'age_limit': 0,
'tags': ['zulkifli hasan', 'menteri perdagangan', 'minyak goreng'], 'tags': ['raja charles', ' raja charles iii', ' ratu elizabeth', ' ratu elizabeth meninggal dunia', ' raja inggris', ' inggris'],
'timestamp': 1657039548, 'release_timestamp': 1662869995,
'upload_date': '20220705' 'release_date': '20220911',
'uploader': 'REUTERS'
} }
}, { }, {
# e-flash # 20.detik
'url': 'https://20.detik.com/e-flash/20220705-220705109/ahli-level-ppkm-jadi-payung-strategi-protokol-kesehatan',
'info_dict': {
'id': '220705109',
'ext': 'mp4',
'tags': ['ppkm jabodetabek', 'dicky budiman', 'ppkm'],
'upload_date': '20220705',
'duration': 110,
'title': 'Ahli: Level PPKM Jadi Payung Strategi Protokol Kesehatan',
'thumbnail': 'https://cdnv.detik.com/videoservice/AdminTV/2022/07/05/Ahli-_Level_PPKM_Jadi_Payung_Strat_jOgUMCN-20220705182313-custom.jpg?w=650&q=80',
'description': 'md5:4eb825a9842e6bdfefd66f47b364314a',
'timestamp': 1657045255,
}
}, {
# otobuzz
'url': 'https://20.detik.com/otobuzz/20220704-220704093/mulai-rp-10-jutaan-ini-skema-kredit-mitsubishi-pajero-sport', 'url': 'https://20.detik.com/otobuzz/20220704-220704093/mulai-rp-10-jutaan-ini-skema-kredit-mitsubishi-pajero-sport',
'info_dict': { 'info_dict': {
'display_id': 'mulai-rp-10-jutaan-ini-skema-kredit-mitsubishi-pajero-sport',
'id': '220704093', 'id': '220704093',
'ext': 'mp4', 'ext': 'mp4',
'tags': ['cicilan mobil', 'mitsubishi pajero sport', 'mitsubishi', 'pajero sport'],
'timestamp': 1656951521,
'duration': 83,
'upload_date': '20220704',
'thumbnail': 'https://cdnv.detik.com/videoservice/AdminTV/2022/07/04/5d6187e402ec4a91877755a5886ff5b6-20220704161859-0s.jpg?w=650&q=80',
'description': 'md5:9b2257341b6f375cdcf90106146d5ffb', 'description': 'md5:9b2257341b6f375cdcf90106146d5ffb',
'thumbnail': r're:https?://cdnv\.detik\.com/videoservice/AdminTV/2022/07/04/5d6187e402ec4a91877755a5886ff5b6-20220704161859-0s.jpg',
'title': 'Mulai Rp 10 Jutaan! Ini Skema Kredit Mitsubishi Pajero Sport', 'title': 'Mulai Rp 10 Jutaan! Ini Skema Kredit Mitsubishi Pajero Sport',
} 'timestamp': 1656951521,
}, {
# sport-buzz
'url': 'https://20.detik.com/sport-buzz/20220704-220704054/crash-crash-horor-di-paruh-pertama-motogp-2022',
'info_dict': {
'id': '220704054',
'ext': 'mp4',
'thumbnail': 'https://cdnv.detik.com/videoservice/AdminTV/2022/07/04/6b172c6fb564411996ea145128315630-20220704090746-0s.jpg?w=650&q=80',
'title': 'Crash-crash Horor di Paruh Pertama MotoGP 2022',
'description': 'md5:fbcc6687572ad7d16eb521b76daa50e4',
'timestamp': 1656925591,
'duration': 107,
'tags': ['marc marquez', 'fabio quartararo', 'francesco bagnaia', 'motogp crash', 'motogp 2022'],
'upload_date': '20220704', 'upload_date': '20220704',
'duration': 83.0,
'tags': ['cicilan mobil', 'mitsubishi pajero sport', 'mitsubishi', 'pajero sport'],
'release_timestamp': 1656926321,
'release_date': '20220704',
'age_limit': 0,
'uploader': 'Ridwan Arifin ' # TODO: strip trailing whitespace from uploader
} }
}, { }, {
# adu-perspektif # pasangmata.detik
'url': 'https://20.detik.com/adu-perspektif/20220518-220518144/24-tahun-reformasi-dan-alarm-demokrasi-dari-filipina', 'url': 'https://pasangmata.detik.com/contribution/366649',
'info_dict': { 'info_dict': {
'id': '220518144', 'id': '366649',
'ext': 'mp4', 'ext': 'mp4',
'title': '24 Tahun Reformasi dan Alarm Demokrasi dari Filipina', 'title': 'Saling Dorong Aparat dan Pendemo di Aksi Tolak Kenaikan BBM',
'upload_date': '20220518', 'description': 'md5:7a6580876c8381c454679e028620bea7',
'timestamp': 1652913823, 'age_limit': 0,
'duration': 185.0, 'tags': 'count:17',
'tags': ['politik', 'adu perspektif', 'indonesia', 'filipina', 'demokrasi'], 'thumbnail': 'https://akcdn.detik.net.id/community/data/media/thumbs-pasangmata/2022/09/08/366649-16626229351533009620.mp4-03.jpg',
'description': 'md5:8eaaf440b839c3d02dca8c9bbbb099a9',
'thumbnail': 'https://cdnv.detik.com/videoservice/AdminTV/2022/05/18/adpers_18_mei_compressed-20220518230458-custom.jpg?w=650&q=80',
} }
}, { }, {
# sosok # insertlive embed
'url': 'https://20.detik.com/sosok/20220702-220703032/resa-boenard-si-princess-bantar-gebang', 'url': 'https://www.insertlive.com/embed/video/290482',
'info_dict': { 'info_dict': {
'id': '220703032', 'id': '290482',
'ext': 'mp4', 'ext': 'mp4',
'timestamp': 1656824438, 'release_timestamp': 1663063704,
'thumbnail': 'https://cdnv.detik.com/videoservice/AdminTV/2022/07/02/SOSOK_BGBJ-20220702191138-custom.jpg?w=650&q=80', 'thumbnail': 'https://akcdn.detik.net.id/visual/2022/09/13/leonardo-dicaprio_169.png?w=600&q=90',
'title': 'Resa Boenard Si \'Princess Bantar Gebang\'', 'age_limit': 0,
'description': 'md5:84ea66306a0285330de6a13fc6218b78', 'description': 'Aktor Leonardo DiCaprio memang baru saja putus dari kekasihnya yang bernama Camilla Morrone.',
'tags': ['sosok', 'sosok20d', 'bantar gebang', 'bgbj', 'resa boenard', 'bantar gebang bgbj', 'bgbj bantar gebang', 'sosok bantar gebang', 'sosok bgbj', 'bgbj resa boenard'], 'release_date': '20220913',
'upload_date': '20220703', 'title': 'Diincar Leonardo DiCaprio, Gigi Hadid Ngaku Tertarik Tapi Belum Cinta',
'duration': 650, 'tags': ['leonardo dicaprio', ' gigi hadid', ' hollywood'],
'uploader': '!nsertlive',
} }
}, { }, {
# viral # beautynesia embed
'url': 'https://20.detik.com/viral/20220603-220603135/merasakan-bus-imut-tanpa-pengemudi-muter-muter-di-kawasan-bsd-city', 'url': 'https://www.beautynesia.id/embed/video/261636',
'info_dict': { 'info_dict': {
'id': '220603135', 'id': '261636',
'ext': 'mp4', 'ext': 'mp4',
'description': 'md5:4771fe101aa303edb829c59c26f9e7c6', 'age_limit': 0,
'timestamp': 1654304305, 'release_timestamp': 1662375600,
'title': 'Merasakan Bus Imut Tanpa Pengemudi, Muter-muter di Kawasan BSD City', 'description': 'Menurut ramalan astrologi, tiga zodiak ini bakal hoki sepanjang September 2022.',
'tags': ['viral', 'autonomous vehicle', 'electric', 'shuttle bus'], 'title': '3 Zodiak Paling Beruntung Selama September 2022',
'thumbnail': 'https://cdnv.detik.com/videoservice/AdminTV/2022/06/03/VIRAL_BUS_NO_SUPIR-20220604004707-custom.jpg?w=650&q=80', 'release_date': '20220905',
'duration': 593, 'tags': ['zodiac update', ' zodiak', ' ramalan bintang', ' zodiak beruntung 2022', ' zodiak hoki september 2022', ' zodiak beruntung september 2022'],
'upload_date': '20220604', 'thumbnail': 'https://akcdn.detik.net.id/visual/2022/09/05/3-zodiak-paling-beruntung-selama-september-2022_169.jpeg?w=600&q=90',
'uploader': 'amh',
}
}, {
# cnbcindonesia embed
'url': 'https://www.cnbcindonesia.com/embed/video/371839',
'info_dict': {
'id': '371839',
'ext': 'mp4',
'title': 'Puluhan Pejabat Rusia Tuntut Putin Mundur',
'tags': ['putin'],
'age_limit': 0,
'thumbnail': 'https://awsimages.detik.net.id/visual/2022/09/13/cnbc-indonesia-tv-3_169.png?w=600&q=80',
'description': 'md5:8b9111e37555fcd95fe549a9b4ae6fdc',
}
}, {
# detik shortlink (we can get it from https://dtk.id/?<url>)
'url': 'https://dtk.id/NkISKr',
'info_dict': {
'id': '220914049',
'ext': 'mp4',
'release_timestamp': 1663114488,
'uploader': 'Tim 20Detik',
'title': 'Pakar Bicara soal Tim Khusus Jokowi dan Mereka yang Pro ke Bjorka',
'age_limit': 0,
'thumbnail': 'https://cdnv.detik.com/videoservice/AdminTV/2022/09/14/f15cae71d7b640c58e75b254ecbb1ce1-20220914071613-0s.jpg?w=400&q=80',
'display_id': 'pakar-bicara-soal-tim-khusus-jokowi-dan-mereka-yang-pro-ke-bjorka',
'upload_date': '20220914',
'release_date': '20220914',
'description': 'md5:5eb03225f7ee40207dd3a1e18a73f1ff',
'timestamp': 1663139688,
'duration': 213.0,
'tags': ['hacker bjorka', 'bjorka', 'hacker bjorka bocorkan data rahasia presiden jokowi', 'jokowi'],
} }
}] }]
def _real_extract(self, url): def _extract_from_webpage(self, url, webpage):
display_id = self._match_id(url) player_type, video_data = self._search_regex(
webpage = self._download_webpage(url, display_id) r'<script\s*[^>]+src="https?://(aws)?cdn\.detik\.net\.id/(?P<type>flowplayer|detikVideo)[^>]+>\s*(?P<video_data>{[^}]+})',
json_ld_data = self._search_json_ld(webpage, display_id) webpage, 'playerjs', group=('type', 'video_data'), default=(None, ''))
if not player_type:
return
video_url = self._html_search_regex( display_id, extra_info_dict = url_basename(url), {}
r'videoUrl\s*:\s*"(?P<video_url>[^"]+)', webpage, 'videoUrl')
formats, subtitles = self._extract_m3u8_formats_and_subtitles(video_url, display_id, ext='mp4')
return merge_dicts(json_ld_data, { if player_type == 'flowplayer':
'id': self._html_search_meta('video_id', webpage), video_json_data = self._parse_json(video_data.replace('\'', '"'), display_id)
video_url = video_json_data['videoUrl']
extra_info_dict = {
'id': self._search_regex(r'identifier\s*:\s*\'([^\']+)', webpage, 'identifier'),
'thumbnail': video_json_data.get('imageUrl'),
}
elif player_type == 'detikVideo':
video_url = self._search_regex(
r'videoUrl\s*:\s*[\'"]?([^"\']+)', video_data, 'videoUrl')
extra_info_dict = {
'id': self._html_search_meta(['video_id', 'dtk:video_id'], webpage),
'thumbnail': self._search_regex(r'imageUrl\s*:\s*[\'"]?([^"\']+)', video_data, 'imageUrl'),
'duration': int_or_none(self._html_search_meta('duration', webpage, fatal=False, default=None)),
'release_timestamp': int_or_none(self._html_search_meta('dtk:publishdateunix', webpage, fatal=False, default=None), 1000),
'timestamp': int_or_none(self._html_search_meta('dtk:createdateunix', webpage, fatal=False, default=None), 1000),
'uploader': self._search_regex(
r'([^-]+)', self._html_search_meta('dtk:author', webpage, default='').strip(), 'uploader',
default=None)
}
formats, subtitles = self._extract_m3u8_formats_and_subtitles(video_url, display_id)
self._sort_formats(formats)
json_ld_data = self._search_json_ld(webpage, display_id, default={})
yield merge_dicts(json_ld_data, extra_info_dict, {
'display_id': display_id,
'title': self._html_search_meta(['og:title', 'originalTitle'], webpage) or self._html_extract_title(webpage),
'description': self._html_search_meta(['og:description', 'twitter:description', 'description'], webpage),
'formats': formats, 'formats': formats,
'subtitles': subtitles, 'subtitles': subtitles,
'tags': str_or_none(self._html_search_meta(['keywords', 'keyword', 'dtk:keywords'], webpage), '').split(','), 'tags': try_call(lambda: self._html_search_meta(
['keywords', 'keyword', 'dtk:keywords'], webpage).split(',')),
}) })
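Review note: several `tags` expectations in the tests above contain entries with a leading space, such as `' raja charles iii'`. That follows from splitting the keywords meta value on `','` without stripping. The sketch below reproduces this with an invented keywords string. `split_keywords` is a hypothetical stand-in for the `try_call(lambda: ....split(','))` expression, and the stripped variant is shown only for comparison; it is not part of this diff.

```python
def split_keywords(keywords):
    """Stand-in for the `tags` expression: split on commas,
    or return None when the meta value is missing (None)."""
    try:
        return keywords.split(',')
    except AttributeError:
        return None


# Invented keywords string in the "a, b, c" style seen in the test data
raw = 'raja charles, raja charles iii, inggris'

tags = split_keywords(raw)
stripped = [tag.strip() for tag in tags]   # comparison only, not in the diff
no_tags = split_keywords(None)
print(tags, stripped, no_tags)
```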

View File

@ -6,7 +6,7 @@
class DoodStreamIE(InfoExtractor): class DoodStreamIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?dood\.(?:to|watch|so|pm)/[ed]/(?P<id>[a-z0-9]+)' _VALID_URL = r'https?://(?:www\.)?dood\.(?:to|watch|so|pm|wf)/[ed]/(?P<id>[a-z0-9]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://dood.to/e/5s1wmbdacezb', 'url': 'http://dood.to/e/5s1wmbdacezb',
'md5': '4568b83b31e13242b3f1ff96c55f0595', 'md5': '4568b83b31e13242b3f1ff96c55f0595',

View File

@ -745,6 +745,45 @@ class MotorTrendIE(DiscoveryPlusBaseIE):
} }
class MotorTrendOnDemandIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:www\.)?motortrendondemand\.com/detail' + DPlayBaseIE._PATH_REGEX
_TESTS = [{
'url': 'https://www.motortrendondemand.com/detail/wheelstanding-dump-truck-stubby-bobs-comeback/37699/784',
'info_dict': {
'id': '37699',
'display_id': 'wheelstanding-dump-truck-stubby-bobs-comeback/37699',
'ext': 'mp4',
'title': 'Wheelstanding Dump Truck! Stubby Bobs Comeback',
'description': 'md5:996915abe52a1c3dfc83aecea3cce8e7',
'season_number': 5,
'episode_number': 52,
'episode': 'Episode 52',
'season': 'Season 5',
'thumbnail': r're:^https?://.+\.jpe?g$',
'timestamp': 1388534401,
'duration': 1887.345,
'creator': 'Originals',
'series': 'Roadkill',
'upload_date': '20140101',
'tags': [],
},
}]
_PRODUCT = 'MTOD'
_DISCO_API_PARAMS = {
'disco_host': 'us1-prod-direct.motortrendondemand.com',
'realm': 'motortrend',
'country': 'us',
}
def _update_disco_api_headers(self, headers, disco_base, display_id, realm):
headers.update({
'x-disco-params': f'realm={realm}',
'x-disco-client': f'WEB:UNKNOWN:{self._PRODUCT}:4.39.1-gi1',
'Authorization': self._get_auth(disco_base, display_id, realm),
})
class DiscoveryPlusIE(DiscoveryPlusBaseIE): class DiscoveryPlusIE(DiscoveryPlusBaseIE):
_VALID_URL = r'https?://(?:www\.)?discoveryplus\.com/(?!it/)(?:\w{2}/)?video' + DPlayBaseIE._PATH_REGEX _VALID_URL = r'https?://(?:www\.)?discoveryplus\.com/(?!it/)(?:\w{2}/)?video' + DPlayBaseIE._PATH_REGEX
_TESTS = [{ _TESTS = [{
@ -907,6 +946,9 @@ class DiscoveryPlusItalyIE(DiscoveryPlusBaseIE):
_TESTS = [{ _TESTS = [{
'url': 'https://www.discoveryplus.com/it/video/i-signori-della-neve/stagione-2-episodio-1-i-preparativi', 'url': 'https://www.discoveryplus.com/it/video/i-signori-della-neve/stagione-2-episodio-1-i-preparativi',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.discoveryplus.com/it/video/super-benny/trailer',
'only_matching': True,
}] }]
_PRODUCT = 'dplus_us' _PRODUCT = 'dplus_us'
@ -916,6 +958,13 @@ class DiscoveryPlusItalyIE(DiscoveryPlusBaseIE):
'country': 'it', 'country': 'it',
} }
def _update_disco_api_headers(self, headers, disco_base, display_id, realm):
headers.update({
'x-disco-params': 'realm=%s' % realm,
'x-disco-client': f'WEB:UNKNOWN:{self._PRODUCT}:25.2.6',
'Authorization': self._get_auth(disco_base, display_id, realm),
})
class DiscoveryPlusItalyShowIE(DiscoveryPlusShowBaseIE): class DiscoveryPlusItalyShowIE(DiscoveryPlusShowBaseIE):
_VALID_URL = r'https?://(?:www\.)?discoveryplus\.it/programmi/(?P<show_name>[^/]+)/?(?:[?#]|$)' _VALID_URL = r'https?://(?:www\.)?discoveryplus\.it/programmi/(?P<show_name>[^/]+)/?(?:[?#]|$)'

View File

@ -54,7 +54,7 @@ def _real_extract(self, url):
raise ExtractorError('Password protected video, use --video-password <password>', expected=True) raise ExtractorError('Password protected video, use --video-password <password>', expected=True)
info_json = self._search_json(r'InitReact\.mountComponent\(.*?,', webpage, 'mountComponent', video_id, info_json = self._search_json(r'InitReact\.mountComponent\(.*?,', webpage, 'mountComponent', video_id,
contains_pattern=r'.+?"preview".+?', end_pattern=r'\)')['props'] contains_pattern=r'{.+?"preview".+?}', end_pattern=r'\)')['props']
transcode_url = traverse_obj(info_json, ((None, 'preview'), 'file', 'preview', 'content', 'transcode_url'), get_all=False) transcode_url = traverse_obj(info_json, ((None, 'preview'), 'file', 'preview', 'content', 'transcode_url'), get_all=False)
formats, subtitles = self._extract_m3u8_formats_and_subtitles(transcode_url, video_id) formats, subtitles = self._extract_m3u8_formats_and_subtitles(transcode_url, video_id)

yt_dlp/extractor/epoch.py Normal file
View File

@ -0,0 +1,46 @@
from .common import InfoExtractor
class EpochIE(InfoExtractor):
_VALID_URL = r'https?://www\.theepochtimes\.com/[\w-]+_(?P<id>\d+)\.html'
_TESTS = [
{
'url': 'https://www.theepochtimes.com/they-can-do-audio-video-physical-surveillance-on-you-24h-365d-a-year-rex-lee-on-intrusive-apps_4661688.html',
'info_dict': {
'id': 'a3dd732c-4750-4bc8-8156-69180668bda1',
'ext': 'mp4',
'title': 'They Can Do Audio, Video, Physical Surveillance on You 24H/365D a Year: Rex Lee on Intrusive Apps',
}
},
{
'url': 'https://www.theepochtimes.com/the-communist-partys-cyberattacks-on-america-explained-rex-lee-talks-tech-hybrid-warfare_4342413.html',
'info_dict': {
'id': '276c7f46-3bbf-475d-9934-b9bbe827cf0a',
'ext': 'mp4',
'title': 'The Communist Partys Cyberattacks on America Explained; Rex Lee Talks Tech Hybrid Warfare',
}
},
{
'url': 'https://www.theepochtimes.com/kash-patel-a-6-year-saga-of-government-corruption-from-russiagate-to-mar-a-lago_4690250.html',
'info_dict': {
'id': 'aa9ceecd-a127-453d-a2de-7153d6fd69b6',
'ext': 'mp4',
'title': 'Kash Patel: A 6-Year-Saga of Government Corruption, From Russiagate to Mar-a-Lago',
}
},
]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
youmaker_video_id = self._search_regex(r'data-trailer="[\w-]+" data-id="([\w-]+)"', webpage, 'url')
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
f'http://vs1.youmaker.com/assets/{youmaker_video_id}/playlist.m3u8', video_id, 'mp4', m3u8_id='hls')
return {
'id': youmaker_video_id,
'formats': formats,
'subtitles': subtitles,
'title': self._html_extract_title(webpage)
}


@ -0,0 +1,99 @@
from .common import InfoExtractor
from ..utils import traverse_obj
class EurosportIE(InfoExtractor):
_VALID_URL = r'https?://www\.eurosport\.com/\w+/[\w-]+/\d+/[\w-]+_(?P<id>vid\d+)'
_TESTS = [{
'url': 'https://www.eurosport.com/tennis/roland-garros/2022/highlights-rafael-nadal-brushes-aside-caper-ruud-to-win-record-extending-14th-french-open-title_vid1694147/video.shtml',
'info_dict': {
'id': '2480939',
'ext': 'mp4',
'title': 'Highlights: Rafael Nadal brushes aside Caper Ruud to win record-extending 14th French Open title',
'description': 'md5:b564db73ecfe4b14ebbd8e62a3692c76',
'thumbnail': 'https://imgresizer.eurosport.com/unsafe/1280x960/smart/filters:format(jpeg)/origin-imgresizer.eurosport.com/2022/06/05/3388285-69245968-2560-1440.png',
'duration': 195.0,
'display_id': 'vid1694147',
'timestamp': 1654446698,
'upload_date': '20220605',
}
}, {
'url': 'https://www.eurosport.com/tennis/roland-garros/2022/watch-the-top-five-shots-from-men-s-final-as-rafael-nadal-beats-casper-ruud-to-seal-14th-french-open_vid1694283/video.shtml',
'info_dict': {
'id': '2481254',
'ext': 'mp4',
'title': 'md5:149dcc5dfb38ab7352acc008cc9fb071',
'duration': 130.0,
'thumbnail': 'https://imgresizer.eurosport.com/unsafe/1280x960/smart/filters:format(jpeg)/origin-imgresizer.eurosport.com/2022/06/05/3388422-69248708-2560-1440.png',
'description': 'md5:a0c8a7f6b285e48ae8ddbe7aa85cfee6',
'display_id': 'vid1694283',
'timestamp': 1654456090,
'upload_date': '20220605',
}
}, {
# geo-fenced, but can be bypassed with XFF
'url': 'https://www.eurosport.com/cycling/tour-de-france-femmes/2022/incredible-ride-marlen-reusser-storms-to-stage-4-win-at-tour-de-france-femmes_vid1722221/video.shtml',
'info_dict': {
'id': '2582552',
'ext': 'mp4',
'title': 'Incredible ride! - Marlen Reusser storms to Stage 4 win at Tour de France Femmes',
'duration': 188.0,
'display_id': 'vid1722221',
'timestamp': 1658936167,
'thumbnail': 'https://imgresizer.eurosport.com/unsafe/1280x960/smart/filters:format(jpeg)/origin-imgresizer.eurosport.com/2022/07/27/3423347-69852108-2560-1440.jpg',
'description': 'md5:32bbe3a773ac132c57fb1e8cca4b7c71',
'upload_date': '20220727',
}
}]
_TOKEN = None
# actually defined in https://netsport.eurosport.io/?variables={"databaseId":<databaseId>,"playoutType":"VDP"}&extensions={"persistedQuery":{"version":1 ..
# but this method requires getting the sha256 hash
_GEO_COUNTRIES = ['DE', 'NL', 'EU', 'IT', 'FR'] # Not complete list but it should work
def _real_initialize(self):
if EurosportIE._TOKEN is None:
EurosportIE._TOKEN = self._download_json(
'https://eu3-prod-direct.eurosport.com/token?realm=eurosport', None,
'Trying to get token')['data']['attributes']['token']
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
json_data = self._download_json(
f'https://eu3-prod-direct.eurosport.com/playback/v2/videoPlaybackInfo/sourceSystemId/eurosport-{display_id}',
display_id, query={'usePreAuth': True}, headers={'Authorization': f'Bearer {EurosportIE._TOKEN}'})['data']
json_ld_data = self._search_json_ld(webpage, display_id)
formats, subtitles = [], {}
for stream_type in json_data['attributes']['streaming']:
if stream_type == 'hls':
fmts, subs = self._extract_m3u8_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id, ext='mp4')
elif stream_type == 'dash':
fmts, subs = self._extract_mpd_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id)
elif stream_type == 'mss':
fmts, subs = self._extract_ism_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
self._sort_formats(formats)
return {
'id': json_data['id'],
'title': json_ld_data.get('title') or self._og_search_title(webpage),
'display_id': display_id,
'formats': formats,
'subtitles': subtitles,
'thumbnails': json_ld_data.get('thumbnails'),
'description': (json_ld_data.get('description')
or self._html_search_meta(['og:description', 'description'], webpage)),
'duration': json_ld_data.get('duration'),
'timestamp': json_ld_data.get('timestamp'),
}


@ -3,6 +3,9 @@
from ..utils import load_plugins from ..utils import load_plugins
# NB: Must be before other imports so that plugins can be correctly injected
_PLUGIN_CLASSES = load_plugins('extractor', 'IE', {})
_LAZY_LOADER = False _LAZY_LOADER = False
if not os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'): if not os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'):
with contextlib.suppress(ImportError): with contextlib.suppress(ImportError):
@ -19,5 +22,5 @@
] ]
_ALL_CLASSES.append(GenericIE) # noqa: F405 _ALL_CLASSES.append(GenericIE) # noqa: F405
_PLUGIN_CLASSES = load_plugins('extractor', 'IE', globals()) globals().update(_PLUGIN_CLASSES)
_ALL_CLASSES = list(_PLUGIN_CLASSES.values()) + _ALL_CLASSES _ALL_CLASSES[:0] = _PLUGIN_CLASSES.values()
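
The last changed line swaps a rebinding of `_ALL_CLASSES` for an in-place slice assignment. A minimal sketch of one observable difference between the two forms, using placeholder strings instead of real extractor classes (every name below is hypothetical and not taken from the diff):

```python
# Placeholder values; the real list holds extractor classes.
all_classes = ['YoutubeIE', 'GenericIE']
plugin_classes = {'MyPluginIE': 'MyPluginIE'}

# Some other code may already hold a reference to the same list object.
alias = all_classes

# Slice assignment inserts at the front of the existing list object,
# so every holder of a reference sees the plugin entries.
all_classes[:0] = plugin_classes.values()

# Rebinding instead builds a brand-new list; `alias` would not follow it.
rebound = list(plugin_classes.values()) + ['YoutubeIE', 'GenericIE']
```

Both forms end up with the same contents; only the slice assignment keeps the identity of the original list.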


@ -772,3 +772,30 @@ def _real_extract(self, url):
if not redirect_url: if not redirect_url:
raise ExtractorError('Invalid facebook redirect URL', expected=True) raise ExtractorError('Invalid facebook redirect URL', expected=True)
return self.url_result(redirect_url) return self.url_result(redirect_url)
class FacebookReelIE(InfoExtractor):
_VALID_URL = r'https?://(?:[\w-]+\.)?facebook\.com/reel/(?P<id>\d+)'
IE_NAME = 'facebook:reel'
_TESTS = [{
'url': 'https://www.facebook.com/reel/1195289147628387',
'md5': 'c4ff9a7182ff9ff7d6f7a83603bae831',
'info_dict': {
'id': '1195289147628387',
'ext': 'mp4',
'title': 'md5:9f5b142921b2dc57004fa13f76005f87',
'description': 'md5:24ea7ef062215d295bdde64e778f5474',
'uploader': 'Beast Camp Training',
'uploader_id': '1738535909799870',
'duration': 9.536,
'thumbnail': r're:^https?://.*',
'upload_date': '20211121',
'timestamp': 1637502604,
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
return self.url_result(
f'https://m.facebook.com/watch/?v={video_id}&_rdr', FacebookIE, video_id)


@ -12,8 +12,10 @@
int_or_none, int_or_none,
parse_age_limit, parse_age_limit,
parse_duration, parse_duration,
traverse_obj,
try_get, try_get,
unified_timestamp, unified_timestamp,
url_or_none,
) )
@ -34,7 +36,8 @@ class FOXIE(InfoExtractor):
'creator': 'FOX', 'creator': 'FOX',
'series': 'Gotham', 'series': 'Gotham',
'age_limit': 14, 'age_limit': 14,
'episode': 'Aftermath: Bruce Wayne Develops Into The Dark Knight' 'episode': 'Aftermath: Bruce Wayne Develops Into The Dark Knight',
'thumbnail': r're:^https?://.*\.jpg$',
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
@ -165,6 +168,7 @@ def _real_extract(self, url):
'season_number': int_or_none(video.get('seasonNumber')), 'season_number': int_or_none(video.get('seasonNumber')),
'episode': video.get('name'), 'episode': video.get('name'),
'episode_number': int_or_none(video.get('episodeNumber')), 'episode_number': int_or_none(video.get('episodeNumber')),
'thumbnail': traverse_obj(video, ('images', 'still', 'raw'), expected_type=url_or_none),
'release_year': int_or_none(video.get('releaseYear')), 'release_year': int_or_none(video.get('releaseYear')),
'subtitles': subtitles, 'subtitles': subtitles,
} }


@ -1,9 +1,9 @@
import os import os
import re import re
import types
import urllib.parse import urllib.parse
import xml.etree.ElementTree import xml.etree.ElementTree
from . import gen_extractor_classes
from .common import InfoExtractor # isort: split from .common import InfoExtractor # isort: split
from .brightcove import BrightcoveLegacyIE, BrightcoveNewIE from .brightcove import BrightcoveLegacyIE, BrightcoveNewIE
from .commonprotocols import RtmpIE from .commonprotocols import RtmpIE
@ -26,11 +26,13 @@
parse_resolution, parse_resolution,
smuggle_url, smuggle_url,
str_or_none, str_or_none,
traverse_obj,
try_call, try_call,
unescapeHTML, unescapeHTML,
unified_timestamp, unified_timestamp,
unsmuggle_url, unsmuggle_url,
url_or_none, url_or_none,
variadic,
xpath_attr, xpath_attr,
xpath_text, xpath_text,
xpath_with_ns, xpath_with_ns,
@ -873,22 +875,6 @@ class GenericIE(InfoExtractor):
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
}, },
}, },
# Wistia embed
{
'url': 'http://study.com/academy/lesson/north-american-exploration-failed-colonies-of-spain-france-england.html#lesson',
'md5': '1953f3a698ab51cfc948ed3992a0b7ff',
'info_dict': {
'id': '6e2wtrbdaf',
'ext': 'mov',
'title': 'paywall_north-american-exploration-failed-colonies-of-spain-france-england',
'description': 'a Paywall Videos video from Remilon',
'duration': 644.072,
'uploader': 'study.com',
'timestamp': 1459678540,
'upload_date': '20160403',
'filesize': 24687186,
},
},
# Wistia standard embed (async) # Wistia standard embed (async)
{ {
'url': 'https://www.getdrip.com/university/brennan-dunn-drip-workshop/', 'url': 'https://www.getdrip.com/university/brennan-dunn-drip-workshop/',
@ -903,7 +889,8 @@ class GenericIE(InfoExtractor):
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
} },
'skip': 'webpage 404 not found',
}, },
# Soundcloud embed # Soundcloud embed
{ {
@ -1086,18 +1073,6 @@ class GenericIE(InfoExtractor):
'skip_download': True, 'skip_download': True,
} }
}, },
{
# JWPlatform iframe
'url': 'https://www.covermagazine.co.uk/feature/2465255/business-protection-involved',
'info_dict': {
'id': 'AG26UQXM',
'ext': 'mp4',
'upload_date': '20160719',
'timestamp': 468923808,
'title': '2016_05_18 Cover L&G Business Protection V1 FINAL.mp4',
},
'add_ie': ['JWPlatform'],
},
{ {
# Video.js embed, multiple formats # Video.js embed, multiple formats
'url': 'http://ortcam.com/solidworks-урок-6-настройка-чертежа_33f9b7351.html', 'url': 'http://ortcam.com/solidworks-урок-6-настройка-чертежа_33f9b7351.html',
@ -2006,22 +1981,6 @@ class GenericIE(InfoExtractor):
}, },
'playlist_count': 6, 'playlist_count': 6,
}, },
{
# Squarespace video embed, 2019-08-28
'url': 'http://ootboxford.com',
'info_dict': {
'id': 'Tc7b_JGdZfw',
'title': 'Out of the Blue, at Childish Things 10',
'ext': 'mp4',
'description': 'md5:a83d0026666cf5ee970f8bd1cfd69c7f',
'uploader_id': 'helendouglashouse',
'uploader': 'Helen & Douglas House',
'upload_date': '20140328',
},
'params': {
'skip_download': True,
},
},
# { # {
# # Zype embed # # Zype embed
# 'url': 'https://www.cookscountry.com/episode/554-smoky-barbecue-favorites', # 'url': 'https://www.cookscountry.com/episode/554-smoky-barbecue-favorites',
@ -2490,6 +2449,21 @@ class GenericIE(InfoExtractor):
'duration': 111.0, 'duration': 111.0,
} }
}, },
{
'note': 'JSON LD with unexpected data type',
'url': 'https://www.autoweek.nl/autotests/artikel/porsche-911-gt3-rs-rij-impressie-2/',
'info_dict': {
'id': 'porsche-911-gt3-rs-rij-impressie-2',
'ext': 'mp4',
'title': 'Test: Porsche 911 GT3 RS',
'description': 'Je ziet het niet, maar het is er wel. Downforce, hebben we het dan over. En in de nieuwe Porsche 911 GT3 RS is er zelfs heel veel downforce.',
'timestamp': 1664920902,
'upload_date': '20221004',
'thumbnail': r're:^https://media.autoweek.nl/m/.+\.jpg$',
'age_limit': 0,
'direct': True,
}
}
] ]
def report_following_redirect(self, new_url): def report_following_redirect(self, new_url):
@ -2621,10 +2595,11 @@ def _real_extract(self, url):
default_search += ':' default_search += ':'
return self.url_result(default_search + url) return self.url_result(default_search + url)
url, smuggled_data = unsmuggle_url(url) original_url = url
url, smuggled_data = unsmuggle_url(url, {})
force_videoid = None force_videoid = None
is_intentional = smuggled_data and smuggled_data.get('to_generic') is_intentional = smuggled_data.get('to_generic')
if smuggled_data and 'force_videoid' in smuggled_data: if 'force_videoid' in smuggled_data:
force_videoid = smuggled_data['force_videoid'] force_videoid = smuggled_data['force_videoid']
video_id = force_videoid video_id = force_videoid
else: else:
@ -2638,7 +2613,10 @@ def _real_extract(self, url):
# to accept raw bytes and being able to download only a chunk. # to accept raw bytes and being able to download only a chunk.
# It may probably better to solve this by checking Content-Type for application/octet-stream # It may probably better to solve this by checking Content-Type for application/octet-stream
# after a HEAD request, but not sure if we can rely on this. # after a HEAD request, but not sure if we can rely on this.
full_response = self._request_webpage(url, video_id, headers={'Accept-Encoding': '*'}) full_response = self._request_webpage(url, video_id, headers={
'Accept-Encoding': '*',
**smuggled_data.get('http_headers', {})
})
new_url = full_response.geturl() new_url = full_response.geturl()
if url != new_url: if url != new_url:
self.report_following_redirect(new_url) self.report_following_redirect(new_url)
@ -2657,14 +2635,15 @@ def _real_extract(self, url):
m = re.match(r'^(?P<type>audio|video|application(?=/(?:ogg$|(?:vnd\.apple\.|x-)?mpegurl)))/(?P<format_id>[^;\s]+)', content_type) m = re.match(r'^(?P<type>audio|video|application(?=/(?:ogg$|(?:vnd\.apple\.|x-)?mpegurl)))/(?P<format_id>[^;\s]+)', content_type)
if m: if m:
self.report_detected('direct video link') self.report_detected('direct video link')
headers = smuggled_data.get('http_headers', {})
format_id = str(m.group('format_id')) format_id = str(m.group('format_id'))
subtitles = {} subtitles = {}
if format_id.endswith('mpegurl'): if format_id.endswith('mpegurl'):
formats, subtitles = self._extract_m3u8_formats_and_subtitles(url, video_id, 'mp4') formats, subtitles = self._extract_m3u8_formats_and_subtitles(url, video_id, 'mp4', headers=headers)
elif format_id.endswith('mpd') or format_id.endswith('dash+xml'): elif format_id.endswith('mpd') or format_id.endswith('dash+xml'):
formats, subtitles = self._extract_mpd_formats_and_subtitles(url, video_id) formats, subtitles = self._extract_mpd_formats_and_subtitles(url, video_id, headers=headers)
elif format_id == 'f4m': elif format_id == 'f4m':
formats = self._extract_f4m_formats(url, video_id) formats = self._extract_f4m_formats(url, video_id, headers=headers)
else: else:
formats = [{ formats = [{
'format_id': format_id, 'format_id': format_id,
@ -2673,8 +2652,11 @@ def _real_extract(self, url):
}] }]
info_dict['direct'] = True info_dict['direct'] = True
self._sort_formats(formats) self._sort_formats(formats)
info_dict['formats'] = formats info_dict.update({
info_dict['subtitles'] = subtitles 'formats': formats,
'subtitles': subtitles,
'http_headers': headers,
})
return info_dict return info_dict
if not self.get_param('test', False) and not is_intentional: if not self.get_param('test', False) and not is_intentional:
@ -2765,7 +2747,20 @@ def _real_extract(self, url):
'age_limit': self._rta_search(webpage), 'age_limit': self._rta_search(webpage),
}) })
domain_name = self._search_regex(r'^(?:https?://)?([^/]*)/.*', url, 'video uploader') self._downloader.write_debug('Looking for embeds')
embeds = list(self._extract_embeds(original_url, webpage, urlh=full_response, info_dict=info_dict))
if len(embeds) == 1:
return {**info_dict, **embeds[0]}
elif embeds:
return self.playlist_result(embeds, **info_dict)
raise UnsupportedError(url)
def _extract_embeds(self, url, webpage, *, urlh=None, info_dict={}):
"""Returns an iterator of video entries"""
info_dict = types.MappingProxyType(info_dict) # Prevents accidental mutation
video_id = traverse_obj(info_dict, 'display_id', 'id') or self._generic_id(url)
url, smuggled_data = unsmuggle_url(url, {})
actual_url = urlh.geturl() if urlh else url
# Sometimes embedded video player is hidden behind percent encoding # Sometimes embedded video player is hidden behind percent encoding
# (e.g. https://github.com/ytdl-org/youtube-dl/issues/2448) # (e.g. https://github.com/ytdl-org/youtube-dl/issues/2448)
@ -2774,38 +2769,20 @@ def _real_extract(self, url):
# There probably should be a second run of generic extractor on unescaped webpage. # There probably should be a second run of generic extractor on unescaped webpage.
# webpage = urllib.parse.unquote(webpage) # webpage = urllib.parse.unquote(webpage)
# Unescape squarespace embeds to be detected by generic extractor,
# see https://github.com/ytdl-org/youtube-dl/issues/21294
webpage = re.sub(
r'<div[^>]+class=[^>]*?\bsqs-video-wrapper\b[^>]*>',
lambda x: unescapeHTML(x.group(0)), webpage)
# TODO: Move to respective extractors # TODO: Move to respective extractors
self._downloader.write_debug('Looking for Brightcove embeds')
bc_urls = BrightcoveLegacyIE._extract_brightcove_urls(webpage) bc_urls = BrightcoveLegacyIE._extract_brightcove_urls(webpage)
if bc_urls: if bc_urls:
entries = [{ return [self.url_result(smuggle_url(bc_url, {'Referer': url}), BrightcoveLegacyIE)
'_type': 'url', for bc_url in bc_urls]
'url': smuggle_url(bc_url, {'Referer': url}),
'ie_key': 'BrightcoveLegacy'
} for bc_url in bc_urls]
return {
'_type': 'playlist',
'title': info_dict['title'],
'id': video_id,
'entries': entries,
}
bc_urls = BrightcoveNewIE._extract_brightcove_urls(self, webpage) bc_urls = BrightcoveNewIE._extract_brightcove_urls(self, webpage)
if bc_urls: if bc_urls:
return self.playlist_from_matches( return [self.url_result(smuggle_url(bc_url, {'Referer': url}), BrightcoveNewIE)
bc_urls, video_id, info_dict['title'], for bc_url in bc_urls]
getter=lambda x: smuggle_url(x, {'referrer': url}),
ie='BrightcoveNew')
self._downloader.write_debug('Looking for embeds')
embeds = [] embeds = []
for ie in gen_extractor_classes(): for ie in self._downloader._ies.values():
if ie.ie_key() in smuggled_data.get('block_ies', []):
continue
gen = ie.extract_from_webpage(self._downloader, url, webpage) gen = ie.extract_from_webpage(self._downloader, url, webpage)
current_embeds = [] current_embeds = []
try: try:
@ -2814,34 +2791,26 @@ def _real_extract(self, url):
except self.StopExtraction: except self.StopExtraction:
self.report_detected(f'{ie.IE_NAME} exclusive embed', len(current_embeds), self.report_detected(f'{ie.IE_NAME} exclusive embed', len(current_embeds),
embeds and 'discarding other embeds') embeds and 'discarding other embeds')
embeds = current_embeds return current_embeds
break
except StopIteration: except StopIteration:
self.report_detected(f'{ie.IE_NAME} embed', len(current_embeds)) self.report_detected(f'{ie.IE_NAME} embed', len(current_embeds))
embeds.extend(current_embeds) embeds.extend(current_embeds)
del current_embeds if embeds:
if len(embeds) == 1: return embeds
return {**info_dict, **embeds[0]}
elif embeds:
return self.playlist_result(embeds, **info_dict)
jwplayer_data = self._find_jwplayer_data( jwplayer_data = self._find_jwplayer_data(
webpage, video_id, transform_source=js_to_json) webpage, video_id, transform_source=js_to_json)
if jwplayer_data: if jwplayer_data:
if isinstance(jwplayer_data.get('playlist'), str): if isinstance(jwplayer_data.get('playlist'), str):
self.report_detected('JW Player playlist') self.report_detected('JW Player playlist')
return { return [self.url_result(jwplayer_data['playlist'], 'JWPlatform')]
**info_dict,
'_type': 'url',
'ie_key': 'JWPlatform',
'url': jwplayer_data['playlist'],
}
try: try:
info = self._parse_jwplayer_data( info = self._parse_jwplayer_data(
jwplayer_data, video_id, require_title=False, base_url=url) jwplayer_data, video_id, require_title=False, base_url=url)
if traverse_obj(info, 'formats', ('entries', ..., 'formats')):
self.report_detected('JW Player data') self.report_detected('JW Player data')
return merge_dicts(info, info_dict) return [info]
except ExtractorError: except ExtractorError:
# See https://github.com/ytdl-org/youtube-dl/pull/16735 # See https://github.com/ytdl-org/youtube-dl/pull/16735
pass pass
@ -2852,11 +2821,8 @@ def _real_extract(self, url):
webpage) webpage)
if mobj is not None: if mobj is not None:
varname = mobj.group(1) varname = mobj.group(1)
sources = self._parse_json( sources = variadic(self._parse_json(
mobj.group(2), video_id, transform_source=js_to_json, mobj.group(2), video_id, transform_source=js_to_json, fatal=False) or [])
fatal=False) or []
if not isinstance(sources, list):
sources = [sources]
formats = [] formats = []
subtitles = {} subtitles = {}
for source in sources: for source in sources:
@ -2869,7 +2835,7 @@ def _real_extract(self, url):
src_type = src_type.lower() src_type = src_type.lower()
ext = determine_ext(src).lower() ext = determine_ext(src).lower()
if src_type == 'video/youtube': if src_type == 'video/youtube':
return self.url_result(src, YoutubeIE.ie_key()) return [self.url_result(src, YoutubeIE.ie_key())]
if src_type == 'application/dash+xml' or ext == 'mpd': if src_type == 'application/dash+xml' or ext == 'mpd':
fmts, subs = self._extract_mpd_formats_and_subtitles( fmts, subs = self._extract_mpd_formats_and_subtitles(
src, video_id, mpd_id='dash', fatal=False) src, video_id, mpd_id='dash', fatal=False)
@ -2887,7 +2853,7 @@ def _real_extract(self, url):
'ext': (mimetype2ext(src_type) 'ext': (mimetype2ext(src_type)
or ext if ext in KNOWN_EXTENSIONS else 'mp4'), or ext if ext in KNOWN_EXTENSIONS else 'mp4'),
'http_headers': { 'http_headers': {
'Referer': full_response.geturl(), 'Referer': actual_url,
}, },
}) })
# https://docs.videojs.com/player#addRemoteTextTrack # https://docs.videojs.com/player#addRemoteTextTrack
@ -2902,24 +2868,26 @@ def _real_extract(self, url):
'url': urllib.parse.urljoin(url, src), 'url': urllib.parse.urljoin(url, src),
'name': sub.get('label'), 'name': sub.get('label'),
'http_headers': { 'http_headers': {
'Referer': full_response.geturl(), 'Referer': actual_url,
}, },
}) })
if formats or subtitles: if formats or subtitles:
self.report_detected('video.js embed') self.report_detected('video.js embed')
self._sort_formats(formats) self._sort_formats(formats)
info_dict['formats'] = formats return [{'formats': formats, 'subtitles': subtitles}]
info_dict['subtitles'] = subtitles
return info_dict
# Looking for http://schema.org/VideoObject # Looking for http://schema.org/VideoObject
json_ld = self._search_json_ld(webpage, video_id, default={}) json_ld = self._search_json_ld(webpage, video_id, default={})
if json_ld.get('url') not in (url, None): if json_ld.get('url') not in (url, None):
self.report_detected('JSON LD') self.report_detected('JSON LD')
return merge_dicts({ return [merge_dicts({
'_type': 'url_transparent', '_type': 'video' if json_ld.get('ext') else 'url_transparent',
'url': smuggle_url(json_ld['url'], {'force_videoid': video_id, 'to_generic': True}), 'url': smuggle_url(json_ld['url'], {
}, json_ld, info_dict) 'force_videoid': video_id,
'to_generic': True,
'http_headers': {'Referer': url},
}),
}, json_ld)]
def check_video(vurl): def check_video(vurl):
if YoutubeIE.suitable(vurl): if YoutubeIE.suitable(vurl):
@ -2990,13 +2958,13 @@ def filter_video(urls):
self._sort_formats(formats) self._sort_formats(formats)
return { return [{
'id': flashvars['video_id'], 'id': flashvars['video_id'],
'display_id': display_id, 'display_id': display_id,
'title': title, 'title': title,
'thumbnail': thumbnail, 'thumbnail': thumbnail,
'formats': formats, 'formats': formats,
} }]
if not found: if not found:
# Broaden the search a little bit # Broaden the search a little bit
found = filter_video(re.findall(r'[^A-Za-z0-9]?(?:file|source)=(http[^\'"&]*)', webpage)) found = filter_video(re.findall(r'[^A-Za-z0-9]?(?:file|source)=(http[^\'"&]*)', webpage))
@ -3035,7 +3003,7 @@ def filter_video(urls):
self.report_detected('Twitter card') self.report_detected('Twitter card')
if not found: if not found:
# We look for Open Graph info: # We look for Open Graph info:
# We have to match any number spaces between elements, some sites try to align them (eg.: statigr.am) # We have to match any number spaces between elements, some sites try to align them, e.g.: statigr.am
m_video_type = re.findall(r'<meta.*?property="og:video:type".*?content="video/(.*?)"', webpage) m_video_type = re.findall(r'<meta.*?property="og:video:type".*?content="video/(.*?)"', webpage)
# We only look in og:video if the MIME type is a video, don't try if it's a Flash player: # We only look in og:video if the MIME type is a video, don't try if it's a Flash player:
if m_video_type is not None: if m_video_type is not None:
@ -3050,17 +3018,14 @@ def filter_video(urls):
webpage) webpage)
if not found: if not found:
# Look also in Refresh HTTP header # Look also in Refresh HTTP header
refresh_header = full_response.headers.get('Refresh') refresh_header = urlh and urlh.headers.get('Refresh')
if refresh_header: if refresh_header:
found = re.search(REDIRECT_REGEX, refresh_header) found = re.search(REDIRECT_REGEX, refresh_header)
if found: if found:
new_url = urllib.parse.urljoin(url, unescapeHTML(found.group(1))) new_url = urllib.parse.urljoin(url, unescapeHTML(found.group(1)))
if new_url != url: if new_url != url:
self.report_following_redirect(new_url) self.report_following_redirect(new_url)
return { return [self.url_result(new_url)]
'_type': 'url',
'url': new_url,
}
else: else:
found = None found = None
@ -3071,10 +3036,12 @@ def filter_video(urls):
embed_url = self._html_search_meta('twitter:player', webpage, default=None) embed_url = self._html_search_meta('twitter:player', webpage, default=None)
if embed_url and embed_url != url: if embed_url and embed_url != url:
self.report_detected('twitter:player iframe') self.report_detected('twitter:player iframe')
return self.url_result(embed_url) return [self.url_result(embed_url)]
if not found: if not found:
raise UnsupportedError(url) return []
domain_name = self._search_regex(r'^(?:https?://)?([^/]*)/.*', url, 'video uploader', default=None)
entries = [] entries = []
for video_url in orderedSet(found): for video_url in orderedSet(found):
@ -3090,7 +3057,7 @@ def filter_video(urls):
video_id = os.path.splitext(video_id)[0] video_id = os.path.splitext(video_id)[0]
headers = { headers = {
'referer': full_response.geturl() 'referer': actual_url
} }
entry_info_dict = { entry_info_dict = {
@ -3114,7 +3081,7 @@ def filter_video(urls):
if ext == 'smil': if ext == 'smil':
entry_info_dict = {**self._extract_smil_info(video_url, video_id), **entry_info_dict} entry_info_dict = {**self._extract_smil_info(video_url, video_id), **entry_info_dict}
elif ext == 'xspf': elif ext == 'xspf':
return self.playlist_result(self._extract_xspf_playlist(video_url, video_id), video_id) return [self._extract_xspf_playlist(video_url, video_id)]
elif ext == 'm3u8': elif ext == 'm3u8':
entry_info_dict['formats'], entry_info_dict['subtitles'] = self._extract_m3u8_formats_and_subtitles(video_url, video_id, ext='mp4', headers=headers) entry_info_dict['formats'], entry_info_dict['subtitles'] = self._extract_m3u8_formats_and_subtitles(video_url, video_id, ext='mp4', headers=headers)
elif ext == 'mpd': elif ext == 'mpd':
@ -3144,14 +3111,9 @@ def filter_video(urls):
entries.append(entry_info_dict) entries.append(entry_info_dict)
if len(entries) == 1: if len(entries) > 1:
return merge_dicts(entries[0], info_dict)
else:
for num, e in enumerate(entries, start=1): for num, e in enumerate(entries, start=1):
# 'url' results don't have a title # 'url' results don't have a title
if e.get('title') is not None: if e.get('title') is not None:
e['title'] = '%s (%d)' % (e['title'], num) e['title'] = '%s (%d)' % (e['title'], num)
return { return entries
'_type': 'playlist',
'entries': entries,
}
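
In this file's hunks, `_extract_embeds` now returns a list of entries, and `_real_extract` decides whether that list becomes a single result or a playlist. The sketch below imitates that split with plain dictionaries. It is a simplification under stated assumptions: `combine`, `is_read_only` and the `LookupError` are hypothetical stand-ins, and the playlist dictionary only approximates what `playlist_result` would build.

```python
import types


def combine(info_dict, embeds):
    """Hypothetical stand-in for the caller's single-vs-playlist decision."""
    if len(embeds) == 1:
        # Keys from the embed override the page-level metadata.
        return {**info_dict, **embeds[0]}
    if embeds:
        # Rough approximation of a playlist result.
        return {'_type': 'playlist', 'entries': embeds, **info_dict}
    # The diff raises UnsupportedError(url) at this point.
    raise LookupError('no embeds found')


def is_read_only(mapping):
    """Return True if item assignment on the mapping is rejected."""
    try:
        mapping['probe'] = 1
    except TypeError:
        return True
    return False


info = {'id': 'page', 'title': 'Page'}
# MappingProxyType gives a read-only view, as the diff does for info_dict.
view = types.MappingProxyType(info)

single = combine(info, [{'url': 'https://example.com/a.mp4', 'title': 'Clip'}])
many = combine(info, [{'url': 'https://example.com/a.mp4'},
                      {'url': 'https://example.com/b.mp4'}])
```

Reads through `view` still work as usual (`view['title']`), while writes raise `TypeError`, so the helper cannot alter the caller's metadata by accident.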


@ -1,5 +1,8 @@
import re
import urllib.parse
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import make_archive_id from ..utils import make_archive_id, unescapeHTML
class HTML5MediaEmbedIE(InfoExtractor): class HTML5MediaEmbedIE(InfoExtractor):
@ -29,3 +32,84 @@ def _extract_from_webpage(self, url, webpage):
}) })
self._sort_formats(entry['formats']) self._sort_formats(entry['formats'])
yield entry yield entry
class QuotedHTMLIE(InfoExtractor):
"""For common cases of quoted/escaped html parts in the webpage"""
_VALID_URL = False
IE_NAME = 'generic:quoted-html'
IE_DESC = False # Do not list
_WEBPAGE_TESTS = [{
# 2 YouTube embeds in data-html
'url': 'https://24tv.ua/bronetransporteri-ozbroyenni-zsu-shho-vidomo-pro-bronovik-wolfhound_n2167966',
'info_dict': {
'id': 'bronetransporteri-ozbroyenni-zsu-shho-vidomo-pro-bronovik-wolfhound_n2167966',
'title': 'Броньовик Wolfhound: гігант, який допомагає ЗСУ знищувати окупантів на фронті',
'thumbnail': r're:^https?://.*\.jpe?g',
'timestamp': float,
'upload_date': str,
'description': 'md5:6816e1e5a65304bd7898e4c7eb1b26f7',
'age_limit': 0,
},
'playlist_count': 2
}, {
# Generic iframe embed of TV24UAPlayerIE within data-html
'url': 'https://24tv.ua/harkivyani-zgaduyut-misto-do-viyni-shhemlive-video_n1887584',
'info_dict': {
'id': '1887584',
'ext': 'mp4',
'title': 'Харків\'яни згадують місто до війни: щемливе відео',
'thumbnail': r're:^https?://.*\.jpe?g',
},
'params': {'skip_download': True}
}, {
# YouTube embeds on Squarespace (data-html): https://github.com/ytdl-org/youtube-dl/issues/21294
'url': 'https://www.harvardballetcompany.org/past-productions',
'info_dict': {
'id': 'past-productions',
'title': 'Productions — Harvard Ballet Company',
'age_limit': 0,
'description': 'Past Productions',
},
'playlist_mincount': 26
}, {
# Squarespace video embed, 2019-08-28, data-html
'url': 'http://ootboxford.com',
'info_dict': {
'id': 'Tc7b_JGdZfw',
'title': 'Out of the Blue, at Childish Things 10',
'ext': 'mp4',
'description': 'md5:a83d0026666cf5ee970f8bd1cfd69c7f',
'uploader_id': 'helendouglashouse',
'uploader': 'Helen & Douglas House',
'upload_date': '20140328',
'availability': 'public',
'view_count': int,
'channel': 'Helen & Douglas House',
'comment_count': int,
'uploader_url': 'http://www.youtube.com/user/helendouglashouse',
'duration': 253,
'channel_url': 'https://www.youtube.com/channel/UCTChGezrZVmlYlpMlkmulPA',
'playable_in_embed': True,
'age_limit': 0,
'channel_follower_count': int,
'channel_id': 'UCTChGezrZVmlYlpMlkmulPA',
'tags': 'count:6',
'categories': ['Nonprofits & Activism'],
'like_count': int,
'thumbnail': 'https://i.ytimg.com/vi/Tc7b_JGdZfw/hqdefault.jpg',
},
'params': {
'skip_download': True,
},
}]
def _extract_from_webpage(self, url, webpage):
combined = ''
for _, html in re.findall(r'(?s)\bdata-html=(["\'])((?:(?!\1).)+)\1', webpage):
# unescapeHTML can handle &quot; etc., unquote can handle percent encoding
unquoted_html = unescapeHTML(urllib.parse.unquote(html))
if unquoted_html != html:
combined += unquoted_html
if combined:
yield from self._extract_generic_embeds(url, combined)
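
The unquoting step in `QuotedHTMLIE._extract_from_webpage` can be tried in isolation. This is a rough sketch, not the extractor itself: it assumes the standard library's `html.unescape` as a stand-in for yt-dlp's `unescapeHTML`, and `collect_quoted_html` and the sample markup are hypothetical.

```python
import html
import re
import urllib.parse

# Same pattern as in the diff: a quoted data-html attribute value.
DATA_HTML_RE = re.compile(r'(?s)\bdata-html=(["\'])((?:(?!\1).)+)\1')


def collect_quoted_html(webpage):
    """Concatenate every data-html value that changes when unquoted."""
    combined = ''
    for _, quoted in DATA_HTML_RE.findall(webpage):
        # unquote resolves percent encoding, unescape resolves entities.
        unquoted = html.unescape(urllib.parse.unquote(quoted))
        if unquoted != quoted:
            combined += unquoted
    return combined


sample = (
    '<div class="sqs-video-wrapper" data-html="&lt;iframe '
    'src=&quot;https://example.com/embed/abc%20def&quot;&gt;&lt;/iframe&gt;"></div>'
)
recovered = collect_quoted_html(sample)
```

Values that are already plain text are skipped, so only markup that was actually hidden behind escaping would be handed to the embed search.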

Some files were not shown because too many files have changed in this diff