DSpam can be configured in many different ways, so there is a plethora of guides, each accomplishing the same thing in a different way.
Yes, it looks like it.

If at all possible, I want to find a setup that also fits as closely as possible into the general upstream structure. It is clear that people are bound to modify their Kolab servers, but the base configuration should always be sound and simple.
Many of them are directed towards POP3. With POP3 you'll often have a web interface with all mail quarantined as spam for ham retraining and an email address to send your spam to for spam retraining. I find that IMAP is a much better alternative since you can have folders and have all the work done behind the scenes. Just drag and drop.
Yes, I agree.
I have two cron scripts running every minute. One checks for new users and creates the spam, block, and unblock folders, then assigns the correct permissions.
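To make the first cron job concrete, here is a minimal Python sketch of the folder-creation step. It is not the script described above. It assumes an admin IMAP connection from `imaplib`, Cyrus-style `user/<name>/<folder>` mailbox names, and the rights string `lrswipcd`; all three are assumptions chosen for illustration.

```python
SPAM_FOLDERS = ("spam", "block", "unblock")  # folder names mentioned in the thread


def missing_folders(user, existing, sep="/"):
    """Return the per-user training folders that do not exist yet."""
    # The "user/<name>/<folder>" layout is an assumption about the namespace.
    wanted = [sep.join(("user", user, name)) for name in SPAM_FOLDERS]
    return [mbox for mbox in wanted if mbox not in existing]


def create_folders(conn, user, existing, rights="lrswipcd"):
    """Create missing folders and grant the user access to them.

    `conn` is expected to behave like an authenticated imaplib.IMAP4
    admin connection; `rights` is a hypothetical default ACL string.
    """
    created = []
    for mbox in missing_folders(user, existing):
        typ, _ = conn.create(mbox)
        if typ != "OK":
            continue  # skip folders the server refused to create
        conn.setacl(mbox, user, rights)
        created.append(mbox)
    return created
```

A cron wrapper would list the existing mailboxes once, then call `create_folders` for every user it finds.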
This is very useful and I would like to integrate something similar upstream. perl-kolab is currently doing the user setup anyhow and I think it might be possible to automatically create the spam folders upon user creation. I will have to discuss this on kolab-devel though and get feedback.
In any case these spam folders should get folder annotations. The Kolab format currently provides "mail.junkemail" as annotation. Maybe we should just change that to "mail.spam" and "mail.ham".
Using folder annotations rather than your fixed structure has the advantage that the sysadmin can also deviate from the per-user scheme and easily create shared folders for the same purpose.
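As a rough illustration of the annotation idea, the following Python sketch picks training folders purely by their folder-type annotation, so per-user and shared folders are treated alike. The values "mail.spam" and "mail.ham" are only the proposal from this mail, not part of the current Kolab format, and the class names "spam" and "innocent" are assumed to match what dspam expects. How the annotations are read from the server is left out.

```python
# "mail.spam" / "mail.ham" are the proposed annotation values, not existing ones.
TRAINING_TYPES = {"mail.spam": "spam", "mail.ham": "innocent"}


def training_folders(folder_types):
    """Map {folder name: folder-type annotation} to {folder name: training class}.

    Folders with any other annotation (calendar, contacts, plain mail)
    are ignored.
    """
    selected = {}
    for folder, folder_type in folder_types.items():
        kind = TRAINING_TYPES.get(folder_type)
        if kind is not None:
            selected[folder] = kind
    return selected
```

Because the selection depends only on the annotation, a sysadmin could annotate a shared folder and it would be picked up without any change to the script.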
The other scans each user's block and unblock folders for corrections, retrains dspam, and does cleanup work.
Did you post that somewhere? I was aiming to write a Python script for the Bayes training. I already have one for the SpamAssassin Bayes stuff, but it needs to be rewritten.
I've only had three issues thus far.
Toltec's MIME parser has an issue with incorrect headers, which causes it to download messages but never display or modify them. I solved this problem by automatically deleting spam older than 30 days. I'd think there would be a way to fix malformed headers in Postfix, but that's a project for another day.
I don't fully understand where these "incorrect headers" come from.
Postfix content filters scan both incoming and outgoing email. I need to figure out how to have them scan only incoming mail.
Ok.
Thunderbird doesn't expunge mail automatically, so if you drag a message into block and then move it to unblock, it gets trained twice. I'm not sure how to overcome that one.
Your script should disregard expunged messages. So I assume that the script is working on file level rather than using IMAP access?
The script I had in mind should scan the Kolab IMAP server for folders of type "mail.spam" and "mail.ham" that can be accessed by "anyone". It would then fetch any existing messages, pass them through the training and purge the folders.
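A minimal Python sketch of that fetch, train and purge loop might look like this. It assumes `conn` is an authenticated `imaplib.IMAP4` connection and that the dspam command line accepts `--user`, `--class=` and `--source=error` with the message on stdin; please check both against your installation. The function names are hypothetical. Searching for `UNDELETED` messages is one way to avoid training on mail that a client flagged as deleted but never expunged.

```python
import subprocess


def dspam_train(user, dspam_class, raw_message):
    """Feed one raw message to the dspam CLI (flags are an assumption)."""
    subprocess.run(
        ["dspam", "--user", user, "--class=" + dspam_class, "--source=error"],
        input=raw_message,
        check=True,
    )


def train_and_purge(conn, folder, user, dspam_class, train=dspam_train):
    """Train on every non-deleted message in `folder`, then purge the folder."""
    typ, _ = conn.select(folder)
    if typ != "OK":
        return 0
    # UNDELETED skips messages already flagged \Deleted but not yet expunged.
    typ, data = conn.uid("SEARCH", None, "UNDELETED")
    uids = data[0].split() if typ == "OK" and data and data[0] else []
    trained = 0
    for uid in uids:
        typ, parts = conn.uid("FETCH", uid, "(RFC822)")
        if typ != "OK" or not parts or not isinstance(parts[0], tuple):
            continue  # nothing usable came back for this UID
        train(user, dspam_class, parts[0][1])
        conn.uid("STORE", uid, "+FLAGS", "(\\Deleted)")
        trained += 1
    # Removes everything flagged \Deleted, including old untrained leftovers.
    conn.expunge()
    return trained
```

Passing `train` as a parameter keeps the IMAP handling testable without a dspam installation and would allow the same loop to feed a different trainer.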
Other than those three issues, it's been smooth sailing with dspam once it was configured.
I don't know if my method is what you had in mind for spam filtering, but let me know.
Besides the points mentioned above, there are only two other things where I'd like to deviate from your approach:
- no MySQL db
- integrated into amavisd-new
A MySQL db on a Kolab server is a no-go for philosophical reasons.

The server should stay as lean as possible.
The amavisd-new approach helps to isolate the spam handling on a single spam box that can be accessed by different servers. I still don't know if having dspam in amavisd-new will cripple it too much but I'll have a look.
Thanks for all your input!
Cheers,
Gunnar