Quora

From Archiveteam
Quora
Quora screenshot 2014-10-07
URL: https://quora.com/
Status: Online!
Archiving status: Not saved yet
Archiving type: Unknown
IRC channel: #archiveteam-bs (on hackint)

Quora is a question-and-answer site known for thoughtful, in-depth contributions and prominent contributors. It is best known for answers from Silicon Valley tech company founders and CEOs, but it covers virtually any topic, including life advice. Answers from tech celebrities can be unique statements on matters of history that are not available anywhere else, and the other thoughtful answers can be valuable as well.

Vital signs

Going strong! However, as on YouTube and other social media sites with user-owned content, a number of users delete their answers, and some delete their entire accounts.

Self-archival

Concerns

Quora specifically disallows the Internet Archive's crawlers, so there is no permanent record of any of its content. From their robots.txt:

# People share a lot of sensitive material on Quora - controversial political
# views, workplace gossip and compensation, and negative opinions held of
# companies. Over many years, as they change jobs or change their views, it is
# important that they can delete or anonymize their previously-written answers.
# 
# We opt out of the wayback machine because inclusion would allow people to
# discover the identity of authors who had written sensitive answers publicly and
# later had made them anonymous, and because it would prevent authors from being
# able to remove their content from the internet if they change their mind about
# publishing it. As far as we can tell, there is no way for sites to selectively
# programmatically remove content from the archive and so this is the only way
# for us to protect writers. If they open up an API where we can remove content
# from the archive when authors remove it from Quora, but leave the rest of the
# content archived, we would be happy to opt back in. See the page here:
# 
# https://archive.org/about/exclude.php
# 
# Meanwhile, if you are looking for an older version of any content on Quora, we
# have full edit history tracked and accessible in product (with the exception of
# content that has been removed by the author). You can generally access this by
# clicking on timestamps, or by appending "/log" to the URL of any content page.
# 
# For any questions or feedback about this please email robotstxt@quora.com.
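
The two mechanisms mentioned in the comment above (a per-crawler opt-out, and the "/log" suffix for edit history) can be sketched with the Python standard library. This is only an illustration: the actual rules in Quora's robots.txt are not reproduced on this page, so SAMPLE_ROBOTS_TXT, the ia_archiver user-agent line, and the helper names is_allowed and log_url are assumptions, not Quora's real configuration.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

# ASSUMPTION: illustrative rules only. The page quotes the comment block of
# Quora's robots.txt but not its directives, so these lines are made up.
SAMPLE_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def log_url(content_url: str) -> str:
    """Build the edit-history URL by appending "/log" to a content page URL.

    Query strings and fragments are dropped, and a trailing slash is removed
    first so the result does not contain "//log".
    """
    parts = urlsplit(content_url)
    path = parts.path.rstrip("/")
    if not path.endswith("/log"):
        path += "/log"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    page = "https://www.quora.com/Some-Question"  # hypothetical URL
    print(is_allowed(SAMPLE_ROBOTS_TXT, "ia_archiver", page))
    print(is_allowed(SAMPLE_ROBOTS_TXT, "SomeOtherBot", page))
    print(log_url(page))
```

Parsing from a string keeps the check offline; to test the live file, the text would have to be fetched first and passed to is_allowed unchanged.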

It is a typical web startup, steeped in Valley culture, so we can pretty much rely on it going belly-up or "pivoting" in a few years and deleting everyone's content. See Quora Keeps the World's Knowledge For Itself, October 2014. Another essay from December 2018 recommends against using Quora at all because of its exclusion from the Wayback Machine, its lack of a public API, and its hostility toward manual web scraping efforts.

As of 2021, their site (both desktop and mobile front ends) uses AJAX-based loading, meaning scraping is only possible through resource-heavy headless page rendering, not simple HTML-based downloading.
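
A rough way to tell whether a plain HTML download is enough for a given page, or whether headless rendering is needed, is to check how much visible text the raw HTML carries compared with its scripts. The following is a heuristic sketch only: the function name looks_like_js_shell and the 200-character threshold are assumptions chosen for illustration, not measured against Quora's actual pages.

```python
from html.parser import HTMLParser


class _TextCounter(HTMLParser):
    """Counts script tags and the visible text outside script/style."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0    # > 0 while inside <script> or <style>
        self.visible_chars = 0  # non-whitespace-trimmed text length
        self.script_count = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
            if tag == "script":
                self.script_count += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only text outside script/style counts as visible content.
        if not self._skip_depth:
            self.visible_chars += len(data.strip())


def looks_like_js_shell(html: str, min_visible_chars: int = 200) -> bool:
    """Guess whether html is a script-driven shell with little real content.

    ASSUMPTION: min_visible_chars is an arbitrary cut-off; tune it per site.
    """
    counter = _TextCounter()
    counter.feed(html)
    counter.close()
    return counter.script_count > 0 and counter.visible_chars < min_visible_chars


if __name__ == "__main__":
    shell = (
        "<html><head><title>Quora</title></head>"
        '<body><div id="root"></div><script>window.boot({})</script></body></html>'
    )
    print(looks_like_js_shell(shell))
```

A page flagged this way would still need an actual headless browser to be captured; the check only helps decide when the cheaper download is not worth attempting.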

Quora also has a history of making brash decisions; for example, it got rid of private blogs and gave users only one month's notice to save them or make them public.

As of 2020, Quora is also known to host serial plagiarism from other, more reputable websites such as Stack Exchange. Users have a financial incentive to plagiarize if they are "Quora partners", a category of users who have no limits on posting and can share revenue from ads on their posts. Such copyright issues might be one reason Quora doesn't want its content to be looked at too closely.

Action items

Quora is amazing in that it's a smartened-up version of Yahoo! Answers and Facebook: less rubbish and slightly better terms of use. As long as it "eats" them (only in English?) and nothing else, it's an improvement; but to be a real force for good it would need several other things.

  • The FUD about the Internet Archive should stop; could someone point them to the IA FAQ on content removal? There should be some pressure on them to allow archiving.
  • It's not free/open culture.[1] Little to do here.
  • No data dumps anywhere, nor export functions? (Particularly bad when they don't allow the Wayback Machine.) There are probably users every now and then who want to export their precious "content"; there should be a script for them to export their stuff. It could also prove useful if Quora ever needs to be panic-grabbed.