  1. #31

    I wanted to point out one difficulty I found when archiving some threads. The archive format does not preserve the BBCode, specifically quotes. I'm not sure how the threads you are referencing are formatted, but if there are many quotes it becomes nearly impossible to read. The quoted text runs right into the reply, and I couldn't tell where my sentences began and the other guy's ended.

    I did have luck saving threads as PDF, though. I found a handy Firefox add-on, the cleverly named "Print pages to Pdf", that lets you print every open tab into a PDF document. It may be a tad unwieldy, but you can open a tab for each page and then use the add-on. I archived a thread this way with 10 pages per PDF and compressed a thread into 8 pages, but there's no reason you couldn't do it in a single PDF.

  2. #32
    Sailing the seas Chris Lang's Avatar
    Join Date
    Apr 2014
    Posts
    3,479

    Quote Originally Posted by impulseucf
    I wanted to point out one difficulty I found when archiving some threads. The archive format does not preserve the BBCode, specifically quotes. I'm not sure how the threads you are referencing are formatted, but if there are many quotes it becomes nearly impossible to read. The quoted text runs right into the reply, and I couldn't tell where my sentences began and the other guy's ended.

    I did have luck saving threads as PDF, though. I found a handy Firefox add-on, the cleverly named "Print pages to Pdf", that lets you print every open tab into a PDF document. It may be a tad unwieldy, but you can open a tab for each page and then use the add-on. I archived a thread this way with 10 pages per PDF and compressed a thread into 8 pages, but there's no reason you couldn't do it in a single PDF.
    I've been saving threads by going into the Print View and setting it to show 40 posts per page. There, you get a much easier-to-read format where all the tags are intact. From there you can save the page either as HTML or as text.
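If you save the page as HTML and later want a plain-text copy, something like the following Python sketch could do the conversion. It is only a rough illustration using the standard library's `html.parser`; the class name `TextExtractor`, the helper `html_to_text`, and the list of tags treated as line breaks are my own assumptions, not anything the forum or the browser provides.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, starting a new line at block-level tags."""

    # Hypothetical choice of tags that should force a line break.
    BLOCK_TAGS = {"p", "div", "br", "li", "tr", "blockquote", "h1", "h2", "h3"}
    # Content inside these tags is never shown to the reader, so skip it.
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Return the text of an HTML document, one non-empty line per block."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.chunks).splitlines())
    return "\n".join(line for line in lines if line)
```

You would read the saved file, pass its contents to `html_to_text`, and write the result to a `.txt` file. Because quotes are block-level elements in the saved HTML, they end up on their own lines instead of running into the reply.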

  3. #33
    Incredible Member Anodyne's Avatar
    Join Date
    May 2014
    Posts
    531

    Quote Originally Posted by Brandon Hanvey
    The link is near the bottom right of the forums.

    Old forums
    http://oldforums.comicbookresources....hive/index.php

    New Community
    http://community.comicbookresources....hive/index.php
    Thanks, Brandon.
    Beverly Allen, the Bee--with honey and stinger.

    "If humans have souls, then clones will have them, too."--Arthur Caplan

  4. #34
    BANNED
    Join Date
    Apr 2014
    Location
    Atlantean Embassy
    Posts
    1,680

    Quote Originally Posted by yet another
    Some tooling might help.

    This one for example: http://www.httrack.com/

    You should probably be careful with it so you don't accidentally download the whole forum (or more!), but only select threads.
    I tried this software, and it took over 7 hours and copied the first five pages of threads in the Avengers forum. HOW do I get it to copy only the thread I specify?


    I've also tried the very onerous task of saving a thread page by page.

    Has anyone tried the Firefox Scrapbook add on? I think someone mentioned it elsewhere.

  5. #35

    Quote Originally Posted by Rheged
    I tried this software, and it took over 7 hours and copied the first five pages of threads in the Avengers forum. HOW do I get it to copy only the thread I specify?
    Yeah, I found it way too difficult. After 24 hours and 80 gigs I still hadn't captured the thread I wanted. I'm sure it was user limitations and there is a way to filter out the data I don't want, but I don't know what that is.
    I've also tried the very onerous task of saving a thread page by page.

    Has anyone tried the Firefox Scrapbook add on? I think someone mentioned it elsewhere.
    I didn't use the Firefox Scrapbook, but I had luck with a Firefox add-on called "Print pages to Pdf" that lets you print every open tab into a PDF document. Just go to your thread, open each page in a new tab, and it combines them. I archived a thread this way with 10 pages per PDF and compressed a thread into 8 pages, but there's no reason you couldn't do it in a single PDF. Also, if you haven't already (not sure if you can at this point), go to your settings and set it to show 40 posts per page.

  6. #36
    bye thx fish yet another's Avatar
    Join Date
    May 2014
    Location
    Undisclosed location
    Posts
    1,731

    Quote Originally Posted by Rheged
    I tried this software, and it took over 7 hours and copied the first five pages of threads in the Avengers forum. HOW do I get it to copy only the thread I specify?
    Ah, sorry about that, should probably have tested myself before giving out the link.

    By default HTTrack will download the whole website, and specifying the start address of a single thread in the "Web Addresses: (URL)" field does not stop that. It will just follow all the links from the thread start page out to the rest of the site, and then recursively follow the links in those pages.

    You can restrict it to a single thread with A LITTLE extra work though. Say you want to save the thread with ID 56361.

    1. First specify the thread start address in "Web Addresses". Something like:

    Code:
    http://oldforums.comicbookresources.com/showthread.php?56361
    2. Then click "Set options..." to open the options dialog. In that, click the "Scan Rules" tab. Once there, edit the white textbox to include (only) the following three lines:

    Code:
    -*
    +oldforums.comicbookresources.com/showthread.php?56361*page*
    +*.gif +*.jpg +*.png +*.js +*.css
    The first one skips everything, the second one enables only the URLs for the thread's pages (* is a wildcard), and the last one enables downloading of images (gif, jpg, png) as well as the JavaScript and CSS files used by the thread pages.

    Note: It seems user avatar URLs do not end in an image extension, so the filter above will not include them in the download. If you want them, add this line to the three filter lines:

    Code:
    +oldforums.comicbookresources.com/image.php*

    I just tested this with a random thread and managed to download only the required files.
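To see how the scan rules above sort URLs into "download" and "skip", here is a rough Python sketch. It is not HTTrack's actual code, just an approximation under a few assumptions: rules are checked in order and the last matching rule wins, `*` is the only wildcard, and the scheme (`http://`) is dropped before matching. The helper names and the sample URLs used when calling it (the thread title suffix, `/page2`, the image path, the `image.php?u=` query) are made up for illustration.

```python
import re
from typing import Sequence

# The three filter lines from the post, one rule per entry.
RULES = [
    "-*",
    "+oldforums.comicbookresources.com/showthread.php?56361*page*",
    "+*.gif", "+*.jpg", "+*.png", "+*.js", "+*.css",
]

# The optional extra line for user avatars.
AVATAR_RULE = "+oldforums.comicbookresources.com/image.php*"


def rule_to_regex(pattern: str) -> "re.Pattern[str]":
    """Turn a rule body into a regex: '*' matches anything, the rest is literal."""
    return re.compile(".*".join(re.escape(part) for part in pattern.split("*")))


def is_allowed(url: str, rules: Sequence[str], default: bool = True) -> bool:
    """Approximate the filter decision for one URL (assumption: last match wins)."""
    # Drop the scheme, since the rules in the post are written without it.
    target = re.sub(r"^[a-zA-Z][a-zA-Z0-9+.-]*://", "", url)
    allowed = default  # hypothetical fallback when no rule matches at all
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if rule_to_regex(pattern).fullmatch(target):
            allowed = sign == "+"
    return allowed


if __name__ == "__main__":
    base = "http://oldforums.comicbookresources.com/"
    for path in (
        "showthread.php?56361-Some-Title/page2",  # a later page of the wanted thread
        "showthread.php?99999-Other-Title/page2",  # a different thread
        "forumdisplay.php?12",                     # a forum index page
        "images/misc/quote.png",                   # an image used by the page
    ):
        print(path, "->", "download" if is_allowed(base + path, RULES) else "skip")
```

Under these assumptions, only pages of thread 56361 and files with the listed extensions pass, and an avatar URL such as `image.php?u=123` passes only once `AVATAR_RULE` is appended to the list.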
    Last edited by yet another; 05-09-2014 at 02:52 PM.
