Has anyone used ArchiveBox for self-hosted web archiving? If so, what are your thoughts on it compared to the Internet Archive or other publicly available services?

  • @hoodlem@hoodlem.me
    1 year ago

    I used it, but unfortunately it did not meet my needs. I’m interested in a full mirror of a website, while ArchiveBox focuses on a single webpage, following links at most 1 level deep. I use wget personally, but if your goal is to archive a single webpage, then ArchiveBox might be a good choice.
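For anyone wondering what a wget-based full mirror might look like, here is a minimal sketch that only assembles the command line. The helper name `build_mirror_command` and the `mirror` output directory are made up for illustration, and the flag selection is one common combination rather than necessarily what the commenter above runs.

```python
import shlex
from typing import List


def build_mirror_command(url: str, dest_dir: str = "mirror") -> List[str]:
    """Assemble a wget invocation for mirroring a whole site (hypothetical helper)."""
    return [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--convert-links",     # rewrite links so the copy works offline
        "--adjust-extension",  # add .html/.css suffixes where they are missing
        "--page-requisites",   # also fetch images, CSS and scripts
        "--no-parent",         # do not climb above the starting directory
        "--directory-prefix", dest_dir,  # where the mirror is written
        url,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_mirror_command("https://example.com/")))
```

To actually run it, the returned list can be passed to `subprocess.run(..., check=True)`, assuming wget is installed and on the PATH.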

  • @ThorrJo
    1 year ago

    I have been experimenting with it. For what it is, it works pretty well … for now. I have concerns about the fact that it’s a ton of moving parts basically duct-taped together by an abuse of the Django admin (Django is the web app framework it’s based on, which I was a developer for long ago). Also, the search function is primitive at best. I don’t think it’s going to be my long-term solution for this need, but maybe I’m wrong.

    • @oldfart@lemm.ee
      1 year ago

      The archived pages are available as files on disk. I also added a script that generates an index.html so I can browse the archive without starting the program. Basically, the only time I run ArchiveBox code is when adding a new site. And I never look at the GUI; it brings nothing to the table.
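The commenter's script is not shown, so the following is only a rough sketch of how such an index generator could work. It assumes each snapshot lives in its own subdirectory containing an `index.html`; the function names `build_index` and `write_index` are hypothetical.

```python
import html
from pathlib import Path
from urllib.parse import quote


def build_index(archive_dir: Path) -> str:
    """Return an HTML page linking to every snapshot folder that has an index.html."""
    items = []
    for snap in sorted(p for p in archive_dir.iterdir() if p.is_dir()):
        # Assumption: one subdirectory per snapshot, each with its own index.html.
        if (snap / "index.html").is_file():
            href = quote(snap.name) + "/index.html"
            items.append(f'<li><a href="{href}">{html.escape(snap.name)}</a></li>')
    body = "\n".join(items)
    return f"<!doctype html>\n<html><body><ul>\n{body}\n</ul></body></html>\n"


def write_index(archive_dir: Path) -> Path:
    """Write the generated listing next to the snapshot folders and return its path."""
    out = archive_dir / "index.html"
    out.write_text(build_index(archive_dir), encoding="utf-8")
    return out
```

Calling `write_index(Path("archive"))` would then produce a static listing that opens straight from disk in any browser. The `archive` path here is a placeholder for wherever the snapshots are stored.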

  • BustedPancake
    1 year ago

    It’s a great tool, but it depends on what you expect from it and your use case. Personally, I tried it but was always disappointed by it. I always just end up using SingleFile(Z) in my browser or on the CLI, along with the usual yt-dlp and the like, and that’s really all I need. If I need to save an entire site, I just use wget or httrack. I don’t really need a browsable archive of my saved pages; I usually sort them by subject when saving.