Cool. Thanks. He’s checking in on the Archive-community process.

So archive.org will pick up the images and zip files as well? That would be so cool.

That’s my understanding. I’ll confirm as I discuss further with them. Waiting on a reply now.

2 Likes

curious if anything came of this. i don’t see any new archive.org refreshes of the old forum since march -- and there is certainly not a complete archive of the threads.

did you get any response? thanks!

3 Likes

No response since I last checked in with him, but he’s got two little kids. I’ll shoot him a note tomorrow. I also may see him Friday.

well, i missed that thread when it happened, but i just wanted to underline that “old” forums, especially those focused on a specialty topic, like the former monome community, should never be deleted.
There is an invaluable amount of information and knowledge in there, contributed by many people. Google search often directs me there, as a matter of fact, and more than once it has turned up some “ta-dah”-inducing info.

I have in mind two forums that revolved around the broadcast industry and disappeared into a black hole a few years ago. Those forums gathered total noobs, amateurs, seasoned professionals, and engineers from world-class manufacturers and media networks. They were cluttered, but well indexed, and the mountains of knowledge in there helped me become a better professional in my field. Reference books are one thing, but the shared experience, the tiny details, are what really make a difference.
When a forum goes away, so does the willingness to contribute of the people who invested time in sharing, and the possibility for newcomers to get a better grasp of a field and start on the shoulders of those who came before.

As for how to archive, given the context i would think that converting the whole thing into static files is the simplest and most future-proof way, as it “only” requires maintaining a simple file server. No idea how much space it would take though. (I just launched an httrack process, so i guess i will know in a few hours/days.)
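For reference, a minimal httrack invocation for this kind of mirror might look like the following; the URL, output folder, and filter are illustrative, not the exact command that was run:

```shell
# Mirror the forum into a local folder of static files (illustrative paths).
# -O sets the output directory; the "+..." filter keeps httrack on this host.
httrack "http://archive.monome.org/community/" \
    -O ./monome-archive \
    "+archive.monome.org/*"
```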

3 Likes

please share your experiences with this! i absolutely want to preserve the forum -- it’s just in a state that is very expensive per month, given that it “acts” like a static web page right now; the search in the forum software is so broken as to be useless. i’ll investigate httrack

2 Likes

I posted instructions for archiving to flat files using wget above. httrack is new to me, but if it works, great.
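The earlier wget instructions aren’t quoted in this excerpt; a typical flat-file mirror invocation, with an illustrative URL, looks like:

```shell
# Recursive mirror for offline browsing (illustrative URL, not the exact command).
# --mirror: recursion + timestamping; --convert-links: rewrite links to local paths;
# --adjust-extension: save HTML pages with a .html suffix;
# --page-requisites: also fetch images/CSS needed to render pages.
wget --mirror --convert-links --adjust-extension --page-requisites \
     --no-parent http://archive.monome.org/community/
```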

Just chiming in to agree that flat files is the way.

httrack is smoking two servers at once, this will be running for a very long time…

4 Likes

thanks, this is good work

oh, sorry i didn’t report back earlier on this.
The httrack session i ran took a few days to complete, then i didn’t find time to check whether it looked alright. I had to restart it once to override the default limit on the maximum number of links it follows.
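If memory serves, that ceiling is httrack’s maximum-links option, raised with the `-#L` flag; the value here is illustrative:

```shell
# Raise httrack's default cap on the number of links it will test
# (value and URL are illustrative assumptions).
httrack "http://archive.monome.org/community/" -O ./monome-archive "-#L10000000"
```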

The resulting folder weighs around 13 GB. It sits on a dedicated server for now.

1 Like

update.

files moved and hosted on the new monome server. old server shut down and saving $$ now.

need some help tuning the apache mod_rewrite.

httrack turns everything into .html files, so old google searches are now busted

ie

http://archive.monome.org/community/discussion/17189/want-to-buy-a-monome-512/p1

is actually now:

http://archive.monome.org/community/discussion/17189/want-to-buy-a-monome-512/p1.html

this one is easy. however:

http://archive.monome.org/community/discussion/16913/raga-music#Item_17

becoming

http://archive.monome.org/community/discussion/16913/raga-music.html#Item_17

is harder. any suggestions?

should we just strip off anchor links?

eventually we should be able to just use google to search “keyword site:archive.monome.org” and potentially be able to get something useful out of it. i mean, heaps of misinformation, but maybe something useful also.
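For the straightforward case, a mod_rewrite rule along these lines (a sketch, not the exact configuration in use) appends .html to extensionless requests:

```apache
# Sketch: if the requested path has no dot (no extension) and the
# request is not already a real file, serve the .html version.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^.]+)$ $1.html [L]
```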

1 Like

Is this what you are looking for?
http://htaccesscheatsheet.com/htaccess.php?tip=alias-clean-urls

Anchor links (the part after #) are not sent to web servers; they are handled by the web browser only. I suspect that for the purposes of mod_rewrite you can ignore them.
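As a quick sanity check of that point, Python’s urllib shows the fragment is a client-side piece of the URL, separate from the path the server (and thus mod_rewrite) ever sees:

```python
from urllib.parse import urlsplit

# The #Item_17 fragment never reaches the server; mod_rewrite only sees the path.
url = "http://archive.monome.org/community/discussion/16913/raga-music#Item_17"
parts = urlsplit(url)
print(parts.path)      # /community/discussion/16913/raga-music
print(parts.fragment)  # Item_17
```

So the second “hard” case reduces to the same rewrite as the first: append .html to the path, and the browser re-attaches the anchor locally.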

3 Likes

Cool. Still no word from Archive. Not sure what’s up there.

ok. it’s all fixed. overall not too many hours of work, but enough. thanks for your help everyone.

past preserved.

10 Likes

ok, mod_rewrite is not quite right:

```
RewriteRule ^/community(.*) http://archive.monome.org/community$1
```

this is great for backwards compatibility with .html-less links, but it breaks the base urls:

http://archive.monome.org/community
http://archive.monome.org/community/

any suggestions to fix this but keep the fix for extension stripping?

How about this?

```
RewriteRule ^.*community(/|)$ http://archive.monome.org/community [L]
RewriteRule ^(.*)[^\.html]$ $1.html
```
This tester is very helpful:
http://htaccess.mwl.be/

ok, reviving this because it never got fixed, which makes the archive just look broken. that tester site doesn’t support the query, unfortunately.

again, here’s what i’m using:

```
RewriteCond %{SCRIPT_FILENAME} !-d
RewriteRule ^([^.]+)$ $1.html [NC,L]
```

with this:

/community → ../community.html
/community/ → /.html

i’m hoping to simply add an overriding condition so that the root is not rewritten.
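One possible fix, sketched under the assumption that the archive files live under the DocumentRoot: test the resolved path with REQUEST_FILENAME, so requests that map to a real directory (like the root) or a real file are left alone:

```apache
# Sketch: only append .html when the request maps to neither a real
# directory nor a real file on disk. REQUEST_FILENAME is an assumption
# here; the rule above used SCRIPT_FILENAME.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^.]+)$ $1.html [NC,L]
```

With the `!-d` condition matching, /community and /community/ would fall through to Apache’s normal directory handling instead of being rewritten to a .html path.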

So nothing I suggested worked? It’s hard to write rewrite rules without being able to test them, and I don’t have access to your web server and content; that’s why I limited myself to syntax the tester does support. The suggestions I made appeared to work in the tester. Maybe somebody else around here is a better apache/regex wizard than I am.