Commit Graph

18390 Commits

Author SHA1 Message Date
Unknown
ad42fc2048 Merge remote-tracking branch 'origin/master' 2020-09-23 03:52:35 +02:00
Unknown
afe4cdcf58 [skip travis] very minor but important workflow related issue 2020-09-23 03:52:26 +02:00
Tom-Oliver Heidel
eb98353bdd [skip travis] adjust available python version 2020-09-23 03:51:29 +02:00
Unknown
f940c3172a add missing future import 2020-09-23 03:35:14 +02:00
Unknown
cdb7547e14 add pyinst to test exceptions 2020-09-23 03:30:33 +02:00
Unknown
7e8772bf82 Merge remote-tracking branch 'origin/master' 2020-09-23 03:22:45 +02:00
Tom-Oliver Heidel
f97123d28c [skip travis] added two spaces 2020-09-23 03:21:28 +02:00
Tom-Oliver Heidel
5bf3bb22d6 [skip travis] new workflow 2020-09-23 03:19:38 +02:00
Unknown
0fcd0fbb8c Merge remote-tracking branch 'origin/master' 2020-09-23 03:19:03 +02:00
Tom-Oliver Heidel
6c4e8b23e6 [skip travis] disable old workflow 2020-09-23 03:18:44 +02:00
Unknown
915f2a92ac update workflow, semi fix integrated updater 2020-09-23 03:16:06 +02:00
xarantolus
c0a1a8926d Use better regex for all fixed extraction types 2020-09-22 20:52:52 +02:00
Unknown
b137e533ee [skip travis] updating issue template tmpls 2020-09-22 18:53:31 +02:00
Unknown
11f96ac427 Merge branch 'ytdl-org-master' 2020-09-22 16:24:06 +02:00
Unknown
1b3f7c9a7e merge youtube-dl master 22.09.2020 2020-09-22 16:09:54 +02:00
Sergey M․
c5764b3f89 [downloader/http] Properly handle missing message in SSLError (closes #26646) 2020-09-22 07:01:59 +07:00
Sergey M․
0837992a22 [downloader/http] Fix access to not yet opened stream in retry 2020-09-22 06:44:14 +07:00
Joel Potts
b84071c0a9 [youtube] Added 'subscriber_count' to extraction 2020-09-21 11:56:51 +02:00
Tom-Oliver Heidel
486ad2cd50 Merge pull request #129 from jbruchon/master
Switch from binary search tree to Python sets
2020-09-20 12:14:03 +02:00
Sergey M․
b55715934b release 2020.09.20 2020-09-20 12:30:45 +07:00
Sergey M․
bbc3b5b4bb [ChangeLog] Actualize
[ci skip]
2020-09-20 12:24:32 +07:00
nixxo
1ca5f821c8 [redtube] Extend _VALID_URL (#26506) 2020-09-20 11:39:42 +07:00
Sergey M․
defc820b70 [twitch] Switch streams to GraphQL and refactor (closes #26535) 2020-09-20 10:05:00 +07:00
Jody Bruchon
a45e861918 Switch from binary search tree to Python sets
Signed-off-by: Jody Bruchon <jody@jodybruchon.com>
2020-09-18 21:18:23 -04:00
Sergey M․
82ef02e936 [telequebec] Fix issues (closes #26368) 2020-09-19 07:56:00 +07:00
Patrick Dessalle
b856b3997c [telequebec] Add support for brightcove videos (closes #25833) 2020-09-19 07:52:57 +07:00
Sergey M․
cd85a1bb8b [pornhub] Extract metadata from JSON-LD (closes #26614) 2020-09-19 06:34:34 +07:00
Sergey M․
ce5b904050 [extractor/common] Relax interaction count extraction in _json_ld 2020-09-19 06:33:17 +07:00
Sergey M․
ad06b99dd4 [extractor/common] Extract author as uploader for VideoObject in _json_ld 2020-09-19 06:13:42 +07:00
JChris246
540b9f5164 [pornhub] Fix view count extraction (#26621) (refs #26614) 2020-09-19 05:59:19 +07:00
Jody Bruchon
fd87f42378 Randomize the ArchiveTree the proper Python way
Signed-off-by: Jody Bruchon <jody@jodybruchon.com>
2020-09-18 14:22:42 -04:00
Tom-Oliver Heidel
53d50142e8 [skip travis] Update issue templates 2020-09-18 16:22:24 +02:00
Tom-Oliver Heidel
c71700dbe4 Merge pull request #125 from jbruchon/master
Keep download archive in memory for better performance
2020-09-18 15:59:31 +02:00
Jody Bruchon
2459b6e1cf Style revisions 2020-09-18 09:35:21 -04:00
Jody Bruchon
4f0150dcec Merge remote-tracking branch 'upstream/master' 2020-09-18 08:49:11 -04:00
Unknown
35d3b674c7 [hotstar] regex the second. 2020-09-18 14:15:34 +02:00
Jody Bruchon
a4d834fb3e Fix wrong variable in position swap corrupting archive list
It's always a simple error in the end, you know?

Signed-off-by: Jody Bruchon <jody@jodybruchon.com>
2020-09-18 00:11:36 -04:00
Jody Bruchon
fda63a4e87 Randomize archive order before populating search tree
This doesn't result in an elegant, perfectly balanced search tree,
but it's absolutely good enough. This commit completely mitigates
the worst-case scenario where the archive file is sorted.

Signed-off-by: Jody Bruchon <jody@jodybruchon.com>
2020-09-17 21:45:40 -04:00
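
The effect described in the commit above can be sketched as follows. This is an illustration only, not the project's code: the `Node`, `insert`, `height`, and `build` names and the sample IDs are hypothetical. It builds a plain unbalanced binary search tree once from sorted keys and once from shuffled keys, then compares the resulting heights.

```python
import random


class Node:
    """One node of a plain, unbalanced binary search tree."""
    __slots__ = ('key', 'left', 'right')

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key iteratively (no recursion) and return the tree root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key == node.key:
            return root  # already present
        branch = 'left' if key < node.key else 'right'
        child = getattr(node, branch)
        if child is None:
            setattr(node, branch, Node(key))
            return root
        node = child


def height(root):
    """Count tree levels with an iterative breadth-first walk."""
    levels = 0
    level = [root] if root is not None else []
    while level:
        levels += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return levels


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


# Hypothetical archive entries, already in sorted order.
ids = ['youtube %06d' % i for i in range(2000)]

# Sorted input: every key goes to the right, so the tree is a linked list.
sorted_height = height(build(ids))

# Shuffled input: not perfectly balanced, but far shallower.
shuffled = ids[:]
random.Random(0).shuffle(shuffled)  # fixed seed only to make the demo repeatable
shuffled_height = height(build(shuffled))
```

With sorted input the height equals the number of entries, so each lookup degrades to a linear scan. Shuffling first keeps the tree shallow without any rebalancing logic.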
Stefan Pöschel
6e65a2a67e [downloader/hls] Fix incorrect end byte in Range HTTP header for media segments with EXT-X-BYTERANGE (#24512) (closes #14748)
The end of the byte range is the first byte that is NOT part of the to
be downloaded range. So don't include it into the requested HTTP
download range, as this additional byte leads to a broken TS packet and
subsequently to e.g. visible video corruption.

Fixes #14748.
2020-09-18 05:26:56 +07:00
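
A minimal sketch of the off-by-one described in the commit above, assuming the playlist attribute has the form `<length>[@<offset>]` and that the end of the range is computed as start plus length (so it is exclusive). The function names `parse_byterange` and `http_range_header` are hypothetical and are not taken from the project's code. HTTP `Range` headers use an inclusive last-byte position, hence the subtraction.

```python
def parse_byterange(value, previous_end=0):
    """Parse an EXT-X-BYTERANGE value of the form '<length>[@<offset>]'.

    Returns (start, end) where end is EXCLUSIVE, i.e. the first byte
    that is not part of the segment. If no offset is given, the segment
    starts where the previous one ended.
    """
    length_str, _, offset_str = value.partition('@')
    length = int(length_str)
    start = int(offset_str) if offset_str else previous_end
    return start, start + length


def http_range_header(start, end_exclusive):
    """Build a Range header value; HTTP byte ranges are inclusive."""
    return 'bytes=%d-%d' % (start, end_exclusive - 1)


start, end = parse_byterange('1000@2000')
header = http_range_header(start, end)
```

Requesting `bytes=2000-3000` here would fetch 1001 bytes, one more than the segment length, which is the extra byte the commit message says corrupts the TS packet.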
Jody Bruchon
1d74d8d9f6 Try to mitigate the problem of loading a fully sorted archive
Sorted archives turn the binary tree into a linked list and make
things horribly slow. This is an incomplete mitigation for this
issue.
2020-09-17 17:28:22 -04:00
Sergey M․
f8c7bed133 [extractor/common] Handle ssl.CertificateError in _request_webpage (closes #26601)
ssl.CertificateError is raised on some python versions <= 3.7.x
2020-09-18 03:41:16 +07:00
Sergey M․
cdc55e666f [downloader/http] Improve timeout detection when reading block of data (refs #10935) 2020-09-18 03:32:54 +07:00
Ori Avtalion
86b7c00adc [downloader/http] Retry download when urlopen times out (#26603) (refs #10935) 2020-09-18 03:15:44 +07:00
Jody Bruchon
1de7ea76f8 Remove recursion in at_insert() 2020-09-17 15:08:33 -04:00
Jody Bruchon
a5029645ae Remove debugging print statements 2020-09-17 14:46:11 -04:00
Jody Bruchon
ecdec1913f Keep download archive in memory for better performance
The old behavior was to open and scan the entire archive file for
every single video download. This resulted in horrible performance
for archives of any remotely large size, especially since all new
video IDs are appended to the end of the archive. For anyone who
uses the archive feature to maintain archives of entire video
playlists or channels, this meant that all such lists with newer
downloads would have to scan close to the end of the archive file
before the potential download was rejected. For archives with tens
of thousands of lines, this easily resulted in millions of line
reads and checks over the course of scanning a single channel or
playlist that had been seen previously.

The new behavior in this commit is to preload the archive file
into a binary search tree and scan the tree instead of constantly
scanning the file on disk for every file. When a new download is
appended to the archive file, it is also added to this tree. The
performance is massively better using this strategy over the more
"naive" line-by-line archive file parsing strategy.

The only negative consequence of this change is that the archive
in memory will not be synchronized with the archive file on disk.
Running multiple instances of the program at the same time that
all use the same archive file may result in duplicate archive
entries or duplicated downloads. This is unlikely to be a serious
issue for the vast majority of users. If the instances are not
likely to try to download identical video IDs then this should
not be a problem anyway; for example, having two instances pull
two completely different YouTube channels at once should be fine.

Signed-off-by: Jody Bruchon <jody@jodybruchon.com>
2020-09-17 14:22:07 -04:00
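
The preload-then-append strategy described in the commit above can be sketched as follows. This is a hedged illustration, not the project's implementation: the `DownloadArchive` class, its `record` method, and the sample IDs are hypothetical, and it uses a Python set (as the later "Switch from binary search tree to Python sets" commit in this log does) rather than the binary search tree this commit introduced.

```python
import os
import tempfile


class DownloadArchive:
    """Keep the archive file's entries in memory for fast membership checks."""

    def __init__(self, path):
        self.path = path
        self.ids = set()
        try:
            # Read the file once at startup instead of once per video.
            with open(path, 'r', encoding='utf-8') as f:
                for line in f:
                    entry = line.strip()
                    if entry:
                        self.ids.add(entry)
        except FileNotFoundError:
            pass  # no archive yet; start empty

    def __contains__(self, entry):
        return entry in self.ids

    def record(self, entry):
        """Append to the file on disk and mirror the entry in memory."""
        with open(self.path, 'a', encoding='utf-8') as f:
            f.write(entry + '\n')
        self.ids.add(entry)


# Small demonstration using a temporary archive file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'archive.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('youtube abc123\n')

    archive = DownloadArchive(path)
    seen_before = 'youtube abc123' in archive
    archive.record('youtube def456')
    seen_after = 'youtube def456' in archive

    with open(path, 'r', encoding='utf-8') as f:
        on_disk = f.read().splitlines()
```

As the commit message notes, a second process appending to the same file would not be reflected in this in-memory copy, so concurrent instances sharing one archive can still produce duplicate entries.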
SeonjaeHyeon
217e517384 [naver] Add support for live videos 2020-09-17 22:14:30 +09:00
Unknown
7ac0ba50ce [hotstar] regex fix 2020-09-17 14:00:03 +02:00
Unknown
fe84e2a391 [skip travis] winver 2020-09-16 14:22:51 +02:00
Unknown
17cb02d0c6 bump version 2020.09.16 2020-09-16 13:55:35 +02:00