72 Commits

Author SHA1 Message Date
Philipp Hagemeister b60016e831 Deal with implicitly UTF-16 decoded webpages
These webpages don't specify an encoding and rely on the BOM
2014-01-21 01:39:40 +01:00
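
The BOM-based decoding described in b60016e831 might look roughly like the sketch below. It is not the code from that commit: `guess_bom_encoding` and `decode_page` are hypothetical names, and the UTF-8 fallback is an assumption. UTF-32 BOMs are not handled here.

```python
import codecs


def guess_bom_encoding(content, default='utf-8'):
    """Return a codec name based on a leading BOM, or `default` if none."""
    if content.startswith(codecs.BOM_UTF8):
        return 'utf-8-sig'
    if content.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The 'utf-16' codec reads the BOM itself to pick the byte order.
        return 'utf-16'
    return default


def decode_page(content, declared_encoding=None):
    """Decode raw page bytes, preferring an explicitly declared encoding."""
    encoding = declared_encoding or guess_bom_encoding(content)
    return content.decode(encoding, 'replace')
```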
Philipp Hagemeister 3ec05685f7 [extractor/common] Limit --write-pages filename to 200 chars
This avoids problems with very long URLs.
2014-01-17 14:47:47 +01:00
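
One way to cap a dump filename at 200 characters, as 3ec05685f7 describes, is sketched below. The function name, the `.dump` suffix, and the hash tail that keeps truncated names distinct are all assumptions for illustration, not the commit's actual code.

```python
import hashlib

MAX_FILENAME_LEN = 200  # limit named in the commit subject


def dump_filename(video_id, url, max_len=MAX_FILENAME_LEN):
    """Build a filesystem-friendly dump filename no longer than max_len."""
    raw = '%s_%s' % (video_id, url)
    # Replace anything that is not clearly safe in a filename.
    safe = ''.join(c if c.isalnum() or c in '-_.' else '_' for c in raw)
    suffix = '.dump'
    if len(safe) + len(suffix) > max_len:
        # Hypothetical choice: keep a short hash so long URLs stay distinct.
        digest = hashlib.md5(raw.encode('utf-8')).hexdigest()[:8]
        keep = max_len - len(suffix) - len(digest) - 1
        safe = safe[:keep] + '_' + digest
    return safe + suffix
```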
Philipp Hagemeister 9933b57430 [pornhub] Use centralized sorting 2014-01-07 10:25:34 +01:00
Philipp Hagemeister 3d3538e422 [khanacademy] Add support (Fixes #2066) 2014-01-07 09:35:34 +01:00
Philipp Hagemeister 5d73273f6f [orf] Use new extraction method (Fixes #2057) 2014-01-06 17:15:27 +01:00
Philipp Hagemeister 9887c9b2d6 [jpopsuki] Simplify 2014-01-03 12:51:37 +01:00
Philipp Hagemeister 08d13955dd [wistia] Prefer original video format above all others
We could also set up a formula which would weigh filesize/bitrate and vcodec/acodec (say, 1GB h264 < 3 GB MPEG2 < 2 GB h264), but that would get really messy real soon.
2014-01-01 20:23:49 +01:00
Philipp Hagemeister 5d4f3985be Document that format_id field should be present 2013-12-26 21:19:00 +01:00
Philipp Hagemeister 7217e148fb [yahoo] Use centralized sorting, and add tbr field 2013-12-25 15:18:40 +01:00
Philipp Hagemeister c7deaa4c74 [zdf] Use centralized sorting 2013-12-24 23:32:04 +01:00
Philipp Hagemeister e6812ac99d [spiegel] Use centralized sorting 2013-12-24 12:40:23 +01:00
Philipp Hagemeister 4bcc7bd1f2 Add temporary _sort_formats helper function 2013-12-24 12:31:42 +01:00
Philipp Hagemeister f49d89ee04 Add a resolution field and improve general --list-formats output 2013-12-24 11:56:02 +01:00
Philipp Hagemeister f45f96f8f8 [myvideo] Use RTMP instead of RTMPT (Fixes #2032) 2013-12-23 15:57:43 +01:00
Philipp Hagemeister 1538eff6d8 [bliptv] Remove support for direct downloads
This is now handled by the generic IE
2013-12-23 15:49:21 +01:00
Philipp Hagemeister aa94a6d315 [aparat] Add support (Fixes #2012) 2013-12-20 17:05:39 +01:00
Jaime Marquínez Ferrándiz c0d0b01f0e [generic] Detect ooyala videos (fixes #2013) 2013-12-19 20:32:12 +01:00
Philipp Hagemeister 46374a56b2 [youtube] Do not warn for videos with allow_rating=0
This fixes #1982
Test video: http://www.youtube.com/watch?v=gi2uH3YxohU
2013-12-17 02:49:56 +01:00
Itay Brandes 87a28127d2 _search_regex's "isatty" call fails with Py2exe's
_search_regex calls the sys.stderr.isatty() function for unix systems.

Py2exe uses a custom Stderr() stream which doesn't have an `isatty()`
function, leading to its crash.

Fixes easily by checking that it's a unix system first.
2013-12-16 21:50:26 +01:00
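
The guard described in 87a28127d2 can be sketched as follows. This is an illustration rather than the patched code: `stream_is_tty` is a hypothetical helper, and the extra `getattr` check goes beyond what the commit message states.

```python
import os
import sys


def stream_is_tty(stream=None):
    """Return True only on non-Windows systems with a real terminal stream."""
    stream = sys.stderr if stream is None else stream
    # Check the platform first, so replacement streams on Windows
    # (such as the one py2exe installs) are never asked for isatty().
    if os.name == 'nt':
        return False
    # Assumption beyond the commit: also tolerate streams without isatty().
    isatty = getattr(stream, 'isatty', None)
    return bool(isatty and isatty())
```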
Philipp Hagemeister d67b0b1596 Reorder info_dict documentation 2013-12-16 14:13:40 +01:00
Philipp Hagemeister c0ba0f4859 Document duration field 2013-12-16 04:09:43 +01:00
Philipp Hagemeister e2b38da931 [mtv] Fixup incorrectly encoded XML documents 2013-12-10 12:45:22 +01:00
Philipp Hagemeister 7cc3570e53 Add fatal=False parameter to _download_* functions.
This allows us to simplify the calls in the youtube extractor even further.
2013-12-09 01:49:03 +01:00
Philipp Hagemeister 19e3dfc9f8 [9gag] Like/dislike count (#1895) 2013-12-05 18:29:07 +01:00
Philipp Hagemeister aaebed13a8 [smotri] Simplify 2013-12-02 17:08:17 +01:00
Philipp Hagemeister 2a275ab007 [zdf] Use _download_xml 2013-11-28 05:47:50 +01:00
Philipp Hagemeister 79d09f47c2 Merge branch 'opener-to-ydl' 2013-11-25 03:30:37 +01:00
Philipp Hagemeister c059bdd432 Remove quality_name field and improve zdf extractor 2013-11-25 03:28:55 +01:00
Philipp Hagemeister 02dbf93f0e [zdf/common] Use API in ZDF extractor.
This also comes with a lot of extra format fields
Fixes #1518
2013-11-25 03:13:22 +01:00
Philipp Hagemeister e03db0a077 Merge branch 'master' into opener-to-ydl 2013-11-24 15:18:44 +01:00
Jaime Marquínez Ferrándiz 267ed0c5d3 [collegehumor] Encode the xml before calling xml.etree.ElementTree.fromstring (fixes #1822)
Uses a new helper method in InfoExtractor: _download_xml
2013-11-24 14:59:19 +01:00
Philipp Hagemeister 7012b23c94 Match --download-archive during playlist processing (Fixes #1745) 2013-11-22 22:46:46 +01:00
Philipp Hagemeister dca0872056 Move the opener to the YoutubeDL object.
This is the first step towards being able to just import youtube_dl and start using it.
Apart from removing global state, this would fix problems like #1805.
2013-11-22 19:57:52 +01:00
Philipp Hagemeister 5904088811 Add support for tou.tv (Fixes #1792) 2013-11-20 06:13:19 +01:00
Philipp Hagemeister 91c7271aab Add automatic generation of format note based on bitrate and codecs 2013-11-16 01:08:43 +01:00
Jaime Marquínez Ferrándiz 78fb87b283 Don't accept '>' inside the content attribute in OpenGraph regexes 2013-11-15 12:54:13 +01:00
Jaime Marquínez Ferrándiz ab2d524780 Improve the OpenGraph regex
* Do not accept '>' between the property and content attributes.
* Recognize the properties if the content attribute is before the property attribute using two regexes (fixes the extraction of the description for SlideshareIE).
2013-11-15 12:24:54 +01:00
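
The two-regex approach from ab2d524780 can be illustrated with the sketch below: one pattern for `property` before `content`, one for the reverse order, and neither allowed to cross a `>`. The helper names are hypothetical and the patterns are a reconstruction from the commit message, not a copy of the repository code.

```python
import re


def og_regexes(prop):
    """Return patterns for both attribute orders of an OpenGraph meta tag."""
    content_re = r'content=(?:"([^>"]+?)"|\'([^>\']+?)\')'
    property_re = r'property=[\'"]og:%s[\'"]' % re.escape(prop)
    # [^>]+? keeps the match inside a single tag.
    template = r'<meta[^>]+?%s[^>]+?%s'
    return [
        template % (property_re, content_re),
        template % (content_re, property_re),
    ]


def og_search(prop, html):
    """Return the content of the og:<prop> meta tag, or None if absent."""
    for pattern in og_regexes(prop):
        match = re.search(pattern, html)
        if match:
            # Exactly one of the two quote-style groups is set.
            return next(g for g in match.groups() if g is not None)
    return None
```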
Philipp Hagemeister eb0a839866 [common] Simplify og_search_property 2013-11-12 10:36:23 +01:00
Marcin Cieślak a8eeb0597b Fix AssertionError when og property not found
On tvp.pl some webpages contain OpenGraph
metadata and some don't.

If og property is not found, _og_search_description
fails with

WARNING: unable to extract OpenGraph description; please report this issue on http://yt-dl.org/bug
Traceback (most recent call last):
  File "/usr/home/saper/bin/youtube-dl", line 18, in <module>
    youtube_dl.main()
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 766, in main
    _real_main(argv)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 719, in _real_main
    retcode = ydl.download(all_urls)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 715, in download
    videos = self.extract_info(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 348, in extract_info
    ie_result = ie.extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 125, in extract
    return self._real_extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/tvp.py", line 56, in _real_extract
    info['description'] = self._og_search_description(webpage)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 331, in _og_search_description
    return self._og_search_property('description', html, fatal=False, **kargs)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 325, in _og_search_property
    return unescapeHTML(escaped)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/utils.py", line 494, in unescapeHTML
    assert type(s) == type(u'')
AssertionError

The patch allows me to use:

  try:
    info['description'] = self._og_search_description(webpage)
    info['thumbnail'] = self._og_search_thumbnail(webpage)
  except RegexNotFoundError:
    pass
2013-11-05 23:19:29 +01:00
Jaime Marquínez Ferrándiz 9103bbc5cd Add the 'webpage_url' field to info_dict
The URL of the video page; it must allow reproducing the result.
It's automatically set by YoutubeDL if it's missing.
2013-11-03 12:11:13 +01:00
Philipp Hagemeister b5d0d817bc Remove superfluous space 2013-10-30 01:09:44 +01:00
Philipp Hagemeister ebc14f251c Merge remote-tracking branch 'origin/master' 2013-10-28 10:44:13 +01:00
Philipp Hagemeister d41e6efc85 New debug option --write-pages 2013-10-28 10:44:02 +01:00
Filippo Valsorda 8ffa13e03e [Instagram] get the non-https link, as they are serving Akamai cert from an instagram.com domain 2013-10-28 02:34:29 -04:00
Jaime Marquínez Ferrándiz 55b3e45bba [vimeo] Fix pro videos and player.vimeo.com urls
The old process can still be used for those videos.
Added RegexNotFoundError, which is raised by _search_regex if it can't extract the info.
2013-10-23 14:38:03 +02:00
Jaime Marquínez Ferrándiz 8c51aa6506 The 'format' field now defaults to '{format_id} - {width}x{height}{format_note}'
Following the YoutubeIE format. The 'format_note' gives additional info about the format, for example '3D' or 'DASH video'.
2013-10-21 14:42:06 +02:00
Philipp Hagemeister 416a5efce7 fix typos 2013-10-18 00:49:45 +02:00
Philipp Hagemeister 8dbe9899a9 Allow users to specify an age limit (fixes #1545)
With these changes, users can now restrict what videos are downloaded by the intended audience, by specifying their age with --age-limit YEARS.
Add rudimentary support in youtube, pornotube, and youporn.
2013-10-06 06:08:56 +02:00
Philipp Hagemeister 2f5865cc6d Clarify that url and ext are optional when formats is given (#980) 2013-10-04 11:09:43 +02:00
Philipp Hagemeister deefc05b88 Document formats (for #980) 2013-10-04 10:40:42 +02:00