Archiveteam - User contributions [en] (https://wiki.archiveteam.org/api.php?action=feedcontributions&user=Vrmlguy&feedformat=atom), updated 2024-03-29T08:19:20Z, MediaWiki 1.37.1<br />
URLTeam (https://wiki.archiveteam.org/index.php?title=URLTeam&diff=486), 2009-04-11T21:19:07Z<br />
<p>Vrmlguy: /* Too many people using TinyURL and similar services */</p>
<hr />
<div>=== Too many people using TinyURL and similar services ===<br />
<br />
Twitter is a great example of what's wrong with trusting an online service with something of value. Check out some 'tweets':<br />
<br />
* Hah, I'm a Zombie! http://tinyurl.com/8gnnb7 Ahh, the fun we all have with each other. about 1 hour ago from web<br />
* Health privacy is dead. Here's why: http://ff.im/GMpx about 14 hours ago from FriendFeed<br />
* Hmm, friendfeed released a new "import Twitter" feature today. It is taking a LONG time on my account. I wonder why.... http://ff.im/GM5W about 14 hours ago from FriendFeed<br />
<br />
If these TinyURL services go away, there's not much content here.<br />
<br />
So, the project: scrape the TinyURL (and similar) services. It's actually not as hard as it sounds, because we don't need to scrape any web pages or parse any HTML. The services just send a Location: header when queried for the hash, so we just ask the service for the hash and parse the headers for the redirect URL:<br />
<br />
(18) swebb@swebb.cluster Wed 11:10am [~] % curl -LLIs http://tinyurl.com/6dvm2t | grep Location <br />
Location: http://www.readwriteweb.com/archives/too_many_people_use_tinyurl.php<br />
(19) swebb@swebb.cluster Wed 11:10am [~] % curl -LLIs http://ff.im/GMpx | grep Location<br />
Location: http://friendfeed.com/e/08954685-00fe-4e55-b28f-4b99f83bfb0d/Health-privacy-is-dead-Here-s-why/<br />
<br />
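The same lookup can be sketched in Python. This is only a rough sketch, not the crawler actually in use: the function names are made up for illustration, and the set of status codes treated as redirects is an assumption, since each service may answer a bit differently.<br />

```python
import http.client
from urllib.parse import urlsplit

# Assumption: shorteners answer with one of the standard redirect codes.
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def extract_location(status, headers):
    """Return the redirect target from a response, or None if there is none.

    `headers` is any iterable of (name, value) pairs; header names are
    compared case-insensitively.
    """
    if status not in REDIRECT_STATUSES:
        return None
    for name, value in headers:
        if name.lower() == "location":
            return value.strip()
    return None


def resolve(short_url, timeout=10):
    """Send one HEAD request and report where the shortener points.

    Redirects are deliberately not followed: only the first hop is wanted.
    """
    parts = urlsplit(short_url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return extract_location(response.status, response.getheaders())
    finally:
        conn.close()
```

Calling resolve("http://tinyurl.com/6dvm2t") would send a single HEAD request and return the Location value if the service replies with a redirect, or None otherwise.<br />
<br />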
Walk through all possible hashes, check for errors, and we're good to go.<br />
<br />
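Walking the hash space could look something like the sketch below. The alphabet (digits plus lowercase letters) and the maximum length of 6 are assumptions based on the tinyurl.com examples above; ff.im clearly uses mixed case, so each service would need its own settings.<br />

```python
import itertools
import string

# Assumption: codes are built from lowercase letters and digits, as in the
# tinyurl.com examples; other services may use a different alphabet.
ALPHABET = string.digits + string.ascii_lowercase


def candidate_codes(alphabet=ALPHABET, min_len=1, max_len=6):
    """Yield every code of length min_len..max_len, shortest first."""
    for length in range(min_len, max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            yield "".join(chars)


def keyspace_size(alphabet=ALPHABET, min_len=1, max_len=6):
    """Number of codes candidate_codes() would yield."""
    n = len(alphabet)
    return sum(n ** length for length in range(min_len, max_len + 1))
```

With 36 characters and lengths 1 through 6 that comes to roughly 2.2 billion candidates per service, which is why the generator yields codes lazily instead of building a list.<br />
<br />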
STATUS: Crawling tinyurl.com and ff.im as a first test, at a rate slow enough that I won't get my IP banned.<br />
<br />
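Keeping to an acceptable rate can be as simple as a minimum delay between requests. A small sketch follows; the Throttle class is hypothetical, and the right interval for each service is a guess that would have to be tuned.<br />

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds; the value is a guess
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Call wait() right before each request; the first call returns immediately and later calls sleep only for whatever part of the interval is still left.<br />
<br />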
* This looks like it would be a good task for distributed computing. [http://www.majestic12.co.uk/ Majestic-12] is a project whose main bottleneck is bandwidth, and they are doing quite well. You'd just need to give people a block of URLs to check, and have them report back the results.<br />
<br />
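Handing out blocks could work by numbering every code of a given length and giving each volunteer a range of numbers. The sketch below is one possible way to do that, not an existing tool: the function names, the base-36 alphabet and the block size are all assumptions.<br />

```python
import string

ALPHABET = string.digits + string.ascii_lowercase  # assumed code alphabet


def index_to_code(index, length, alphabet=ALPHABET):
    """Map 0 <= index < len(alphabet)**length to a fixed-length code."""
    base = len(alphabet)
    if not 0 <= index < base ** length:
        raise ValueError("index out of range for this code length")
    chars = []
    for _ in range(length):
        index, digit = divmod(index, base)
        chars.append(alphabet[digit])
    return "".join(reversed(chars))


def make_blocks(length, block_size, alphabet=ALPHABET):
    """Split the keyspace for one code length into (start, stop) index ranges."""
    total = len(alphabet) ** length
    return [(start, min(start + block_size, total))
            for start in range(0, total, block_size)]


def codes_in_block(block, length, alphabet=ALPHABET):
    """Expand one (start, stop) block into the codes a volunteer should check."""
    start, stop = block
    return [index_to_code(i, length, alphabet) for i in range(start, stop)]
```

A coordinator only has to track which (start, stop) pairs have been handed out and which have come back, since each volunteer can rebuild the actual codes locally.<br />
<br />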
Sites that I've collected that offer similar services:<br />
<br />
1link.in<br />
4url.cc<br />
6url.com<br />
adjix.com<br />
ad.vu<br />
bellypath.com<br />
bit.ly<br />
bkite.com<br />
budurl.com<br />
canurl.com<br />
cli.gs<br />
decenturl.com<br />
dn.vc<br />
doiop.com<br />
dwarfurl.com<br />
easyuri.com<br />
easyurl.net<br />
ff.im<br />
go2cut.com<br />
gonext.org<br />
hulu.com<br />
hypem.com<br />
ifood.tv<br />
ilix.in<br />
is.gd<br />
ix.it<br />
jijr.com<br />
kissa.be<br />
kurl.us<br />
litturl.com<br />
lnkurl.com<br />
memurl.com<br />
metamark.net<br />
miklos.dk<br />
minilien.com<br />
minurl.org<br />
muhlink.com<br />
myurl.in<br />
myurl.us<br />
notlong.com<br />
ow.ly<br />
plexp.com<br />
poprl.com<br />
qurlyq.com<br />
redirx.com<br />
s3nt.com<br />
shorterlink.com<br />
shortlinks.co.uk<br />
short.to<br />
shorturl.com<br />
shrinklink.co.uk<br />
shrinkurl.us<br />
shrt.st<br />
shurl.net<br />
simurl.com<br />
shorl.com<br />
smarturl.eu<br />
snipr.com<br />
snipurl.com<br />
snurl.com<br />
sn.vc<br />
starturl.com<br />
surl.co.uk<br />
tighturl.com<br />
timesurl.at<br />
tiny123.com<br />
tiny.cc<br />
tinylink.com<br />
tinyurl.com<br />
tobtr.com<br />
traceurl.com<br />
tr.im<br />
tweetburner.com<br />
twitpwr.com<br />
twitthis.com<br />
twurl.nl<br />
u.mavrev.com<br />
ur1.ca<br />
url9.com<br />
urlborg.com<br />
urlbrief.com<br />
urlcover.com<br />
urlcut.com<br />
urlhawk.com<br />
url-press.com<br />
urlsmash.com<br />
urltea.com<br />
urlvi.be<br />
vimeo.com<br />
wlink.us<br />
xaddr.com<br />
xil.in<br />
xrl.us<br />
x.se<br />
xs.md<br />
yatuc.com<br />
yep.it<br />
yweb.com<br />
zi.ma<br />
w3t.org<br />
<br />
[[Category: URL Shortening]]</div>