This article provides a Perl script that batch-downloads images from a list of URLs and can bypass anti-hotlinking (Referer-based) protection.
As for obtaining the image URLs: most browsers let you right-click an image and choose "Copy image link". The ideal case is a site whose image URLs follow a pattern, so you can generate as many URLs as you need in Excel.
Firefox's "View Page Info" dialog does list every image address on a page and supports saving them in bulk, but some image links, such as those on QQzone, do not end with an image file extension. In that case bulk saving writes every image to the same filename, each download overwriting the previous one, so you end up with a single file. This script avoids the collision by naming such files with a running counter.
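The naming rule described above can be sketched as a small standalone sub (the regex is the one the script uses; the URLs below are placeholders for illustration):

```perl
use strict;
use warnings;

# Derive a local filename from a URL: use the last path component when it
# looks like "name.ext"; otherwise fall back to a counter-based name so
# extension-less links (e.g. QQzone images) do not overwrite each other.
my $n = 0;
sub filename_for {
    my ($url) = @_;
    return $1 if $url =~ /\/([^\/]+?\.\w+?)$/;
    return "file_" . $n++ . ".jpg";
}

print filename_for('http://example.com/a/12162977.jpg'), "\n";  # 12162977.jpg
print filename_for('http://example.com/photo/show?id=42'), "\n"; # file_0.jpg
```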
The first time you run batch_download_images.pl, it generates a default URL file that looks like this:
# The referer defeats anti-hotlinking checks; any page on the same site will do
referer = http://www.wzsky.net/html/Website/Color/117958_11.html
# urls
http://www.wzsky.net/img2013/uploadimg/20130906/12162977.jpg
http://www.wzsky.net/img2013/uploadimg/20130906/12162978.jpg
http://www.wzsky.net/img2013/uploadimg/20130906/12162979.jpg
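The `referer` setting is all the "anti-hotlinking" bypass amounts to: LWP's `get()` accepts extra header/value pairs, and most hotlink protection only checks that the Referer header matches the image's own site. A minimal sketch, using `HTTP::Request::Common` (bundled with libwww-perl) so the request can be inspected without touching the network; the URLs here are placeholders:

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(GET);

# Build the same request the script would send, with an explicit Referer.
my $req = GET 'http://example.com/img/12162977.jpg',
    Referer => 'http://example.com/html/page.html';

print $req->header('Referer'), "\n";   # http://example.com/html/page.html
print $req->uri, "\n";                 # http://example.com/img/12162977.jpg
```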
The code:
#!/usr/bin/perl
# Name : Batch download images by url file.
# Author : Wei Shen
# Contact : shenwei356#gmail.com
# Site : http://shenwei.me
# Date : 2013-10-22
# Update : 2013-10-22
use strict;
use warnings;
use LWP::UserAgent;
unless ( @ARGV >= 1 ) {
    &create_sample_urls_file() unless -e "urls.txt";
    die "\nUsage: $0 <URLs File> [<URLs File> ...]\n\n";
}
my $s = "file_";
my $n = 0;
my ( $file, $url, $f );
my $browser = LWP::UserAgent->new;
my $response;
my $referer;
my ( $para, $urls );
while ( @ARGV > 0 ) {
    $file = shift @ARGV;
    ( $para, $urls ) = &read_urls($file);
    $referer = $$para{referer};
    for $url (@$urls) {
        # Use the URL's own filename when it ends with an extension;
        # otherwise fall back to a counter-based name.
        if ( $url =~ /\/([^\/]+?\.\w+?)$/ ) {
            $f = $1;
        }
        else {
            $f = "$s$n.jpg";
            $n++;
        }
        $response = $browser->get( $url, Referer => $referer );
        unless ( $response->is_success ) {
            warn "Failed to fetch $url: ", $response->status_line, "\n";
            next;
        }
        open( OUT, ">", $f ) or die $!;
        binmode(OUT);
        print OUT $response->content;
        close(OUT);
        print $f, "\n";
    }
}
sub read_urls {
    my ($file) = @_;
    my $para = {};
    my $urls = [];
    open IN, "<", $file or die "File $file failed to open.\n";
    while (<IN>) {
        s/^\s+//;
        s/\s+$//;
        next if $_ eq ''    # blank line
            or /^#/;        # comment line
        s/\s*#.*$//;        # strip inline comment
        if (/(\w+)\s*=\s*(.+)/) {
            warn "$1 was defined more than once\n" if defined $$para{$1};
            $$para{$1} = $2;
            warn "value of $1 undefined!\n" if $2 eq '';
        }
        else {
            push @$urls, $_;
        }
    }
    close IN;
    return ( $para, $urls );
}
# Create a sample urls file
sub create_sample_urls_file {
    my $content = <<"URL";
# The referer defeats anti-hotlinking checks; any page on the same site will do
referer = http://www.wzsky.net/html/Website/Color/117958_11.html
# urls
http://www.wzsky.net/img2013/uploadimg/20130906/12162977.jpg
http://www.wzsky.net/img2013/uploadimg/20130906/12162978.jpg
http://www.wzsky.net/img2013/uploadimg/20130906/12162979.jpg
URL
    open OUT, ">", "urls.txt"
        or die "Failed to create default url file\n";
    print OUT $content;
    close OUT;
}
This script is shared for learning purposes only. Please respect copyright and do not hotlink or appropriate images from other people's sites.