 /\
/  \   (C) Copyright 2006 Parliament Hill Computers Ltd.
\  /   All rights reserved.
 \/
 .     Author: Alain Williams, April 2006
 .
 .     SCCS: @(#)Documentation 1.5 04/03/08 09:57:53
 .

Overview
********

MailToUrl is to help those people who send email with large attachments -
which is a bad idea. Instead they send mail to a MailToUrl mailbox that saves
the attachments and replies with a URL that can be pasted into email that is
sent. The mail recipients can, if they wish, read the attachments by clicking
on the URL.

Why not send out large attachments ?
************************************

* It clogs up mail servers - especially if it is sent to a popular mail list
* It fills up mail boxes - many people have quota limits on their mail boxes
* It can cost people money to download large email - many still use slow modems
* Some (corporate) sites filter out many attachments - so it never gets there
* Some users can't read attachments anyway (think Blackberry users)

Summary of features
*******************

* Receive mail with an attachment and reply with a URL
* Auto delete (purge) attachments (files) beyond an expiry date
* Different attachments can have different expire times
* Allow the original sender of an attachment to remove (delete) the attachment
* Allow users to list what files are in the archive
* Enforce minimum and maximum attachment sizes
* Enforce a quota -- maximum amount of disk space used by all of the attachments
* Attachment names are filtered to avoid problems that strange characters can cause
  Alpha Numeric and '.' and '-' (no spaces), max 32 characters.

The big picture of how MailToUrl is set up
******************************************

You need to be able to:
a) automatically intercept email for a user and filter it through MailToUrl
b) have control of a Web or FTP server

1) You set it up on a server with a good Internet connection
2) You choose if you want to generate FTP (anonymous File Transfer) URLs or
   HTTP (World Wide Web) URLs.
   A URL is an Internet address for a document or file.
3) You set up a directory where MailToUrl can create files.
4) Choose an email address that users can send to.
5) You arrange for email for the address to be processed by MailToUrl.
6) Create a .list file in the archive directory
7) Arrange for MailToUrl to be run with the purge option once a day.

MailToUrl is written in the Perl scripting language.
The above is easy on a Unix or Linux system, let me know if you achieve this
on anything else.

Security Warning
****************

Simple authentication is done using the email sender address, which is
notoriously easy to forge. The authentication is used to only permit the
original author of a file to delete it before its expiry date. It is possible
that a malicious person could remove the original content and replace it with
something else.
This program should NOT be used in situations where a high confidence in the
assured contents of the files is required.

The original author will receive a confirmation mail informing him of the file
delete/replacement. This will hopefully prompt him to investigate what has
happened.

It may be possible to increase sender authentication using something like an
SPF check in the MTA.

Choosing URL type
*****************

The obvious URL type is HTTP (Web protocol). Most users will be able to
download HTTP using their desktop browser. Browsers should also support FTP,
however some corporate firewalls may try to block this.

There is a potential security risk with HTTP for the web server machine.
MailToUrl allows all file types to be uploaded. Unless the web server is
properly configured it might interpret one of these files as a web scripting
program and try to run it, eg: prog.php, prog.cgi, prog.pl. There are also
cross site scripting dangers.
An FTP server just serves up files, no special interpretation is applied.

If you are concerned about safety I suggest choosing FTP.

Web Server Configuration
************************

If you do want to use a web server the following may help. It is an Apache
virtualhost definition. The important bit is 'AddType' which overrides the
normal interpretations of problematic file types.
This may not be sufficient, take care:

    ServerAdmin root@example.com
    DocumentRoot /var/www/files
    ServerName files.example.com:80
    ErrorLog logs/files/error_log
    CustomLog logs/files/access_log common
    Options FollowSymLinks -ExecCGI -IncludesNOEXEC
    # Stop PHP scripts, etc, being executed:
    AddType text/plain .php .pl .cgi .html .shtml .htm .xhtml .xht .css .sgml .sgm .xml .xsl

FTP Server Configuration
************************

There are so many different FTP servers out there that it is not possible to
give sample configuration files. However there are a few things that you
should think about.

* Do you want anonymous ftp access ?
  If 'yes' then anyone who knows a URL can download a file.
  If 'no' you need to distribute a username & password to the mail recipients.

* Do you want the directories to be browsable (with dir) ?
  If you allow it then anyone who discovers the site can easily see, and
  download, all of your documents.
  Disabling directory listings helps privacy - but does not guarantee it.

* Do you want visiting users who download files to be able to upload other files ?
  Probably not, so disable remote uploading to these directories.

* You probably don't want the .list file readable.

Integration with Email
**********************

You need to arrange for email for an appropriate user (or group of users) to
be passed to MailToUrl.
You need to give as arguments to MailToUrl:

* mailbox name - IE the local part of the mail address that received the attachment
* sender - who sent the email
* subject - the Subject header from the email
* Attachment storage directory name - optional, if the default is not used.

Parts of the exim MTA (http://www.exim.org/) configuration file follow.
This file will probably be called /usr/exim/configure or /etc/exim/exim.conf
This is for exim version 4.

In this example attachments are sent to save@files.example.com

In the routers section:

    # This is a user where attachments will be put up for web serving
    mail_to_url_router:
      driver = accept
      local_parts = save
      domains = example.com
      transport = mail_to_url_transport
      no_more

In the transports section:

    # Mail sent to this address is piped through MailToUrl
    mail_to_url_transport:
      driver = pipe
      command = /usr/local/bin/MailToUrl $local_part $sender_address $header_subject
      user = mail
      group = apache
      return_fail_output = true

Notes:

* This assumes that you have put MailToUrl in /usr/local/bin/
* You may need to change the user or group as appropriate
* You may need to add something to the ACL to allow mail to be received for
  this mailbox.
* The configuration relies on exim sanitising $local_part - MailToUrl only
  allows it to contain alphanumeric characters. Do NOT remove this
  restriction, if it were to contain '/' this might allow unwanted access to
  other directories.

Here is a more complicated version.

* Mail can be sent to several users @files.example.com, each of which is
  configured separately
* Who can use the service is restricted

The users allowed to use a service are listed in a file, eg: for
archive@files.example.com there is the file
/var/spool/exim/file_domains/archive:

    .dir    /var/ftp/files/archive
    john@example.com
    bilbo@shire.com

(The .dir line is explained later.)

There will be one such file for each mailbox in the domain files.example.com

The router:

    # This is a domain where attachments will be put up for web/ftp serving
    # The local part is used to determine what organisation this is being done for,
    # each org will have a separate directory.
    # The sending user must be listed in the file for the local user.
    mail_to_url_router:
      driver = accept
      domains = files.example.com
      require_files = /var/spool/exim/file_domains/$local_part
      senders = lsearch;/var/spool/exim/file_domains/$local_part
      transport = mail_to_url_transport
      no_more

If a sender is not listed in /var/spool/exim/file_domains/$local_part mail
will be bounced.

If the domain accepts regular mail as well, you should replace the senders
line thus:

      senders = ${if exists {/var/spool/exim/file_domains/$local_part}\
                {lsearch;/var/spool/exim/file_domains/$local_part}{*}}

We allow each mailbox to have a separate directory, this is looked up in the
.dir line in the sender validation file.

The transport:

    # Mail sent to this address is piped through MailToUrl
    mail_to_url_transport:
      driver = pipe
      command = /usr/local/bin/MailToUrl $local_part $sender_address $header_subject ${lookup{.dir}lsearch{/var/spool/exim/file_domains/$local_part}}
      user = mail
      group = apache
      return_fail_output = true

Procmail
********

It should be possible to run MailToUrl from procmail. Please send me the
recipe if you do this.

The .list file
**************

This file is in the archive directory and contains:

* A list of the files in that directory, who sent them there and when the
  files expire. The fields in this line are:

    Filename EmailOfOwner ExpireTime-Integer ExpireTime-Text

  eg:

    notes.pdf addw@phcomp.co.uk 1148228772 Sunday 21 May 2006

  These lines are added/removed by MailToUrl.

* Configuration directives. All of these have names that start with a '.'
  and they have one value in the second field, eg:

    .Quota 20480

Note that the fields in these lines are separated by tab characters, if you
edit this file by hand you MUST preserve the distinction between spaces & tabs.

This file must be writable by MailToUrl, you probably do not want it readable
by the web or ftp server.
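
The record layout described above can be read with very little code. The
following Python sketch is only an illustration of the format as documented
here (tab separated fields, directive names starting with '.'); it is not
MailToUrl's own Perl code, and the function name parse_list_file and the
dictionary keys are hypothetical.

```python
def parse_list_file(text):
    """Split the text of a .list file into (directives, files)."""
    directives = {}
    files = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")  # fields are separated by tabs, not spaces
        if fields[0].startswith("."):
            # Configuration directive: name, then one value in the second field
            directives[fields[0]] = fields[1] if len(fields) > 1 else ""
        elif len(fields) >= 4:
            # File line: Filename EmailOfOwner ExpireTime-Integer ExpireTime-Text
            files.append({
                "name": fields[0],
                "owner": fields[1],
                "expires": int(fields[2]),   # seconds since the Unix epoch
                "expires_text": fields[3],   # human readable copy of the above
            })
        # Lines with too few fields are skipped (an assumption of this sketch)
    return directives, files


sample = (".Quota\t20480\n"
          "notes.pdf\taddw@phcomp.co.uk\t1148228772\tSunday 21 May 2006\n")
directives, files = parse_list_file(sample)
```

Note that the date text contains spaces, which is one reason why the tab
separators must be preserved when editing the file by hand.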

A good starter for this file for files@files.example.com might be (indented):

    .ProtoRoot files/
    .Domain files.example.com
    .Protocol ftp
    .MaxSize 4000000

Configuration directives
************************

These are found in the .list file in the archive directory, they are:

.Protocol
    The protocol that will be used to serve the file, this makes up the first
    part of the URL, default: 'http'

.Domain
    The domain (machine name) that will be given in the URL, default the
    server's host name.

.ProtoRoot
    A path that needs to be prepended to the mailbox name (email local part)
    to reach the archive when accessing it via the protocol, default ''.

.MaxSize
    Files larger than this will not be archived. Units: bytes
    Default: 2 Megabytes.

.MinSize
    Files smaller than this will not be archived. Units: bytes
    Default: 2 Kilobytes.

.FileDir
    The archive directory (where files are served from). This allows the
    .list file to be in a separate directory.
    Default: $FileDirRoot/$mailbox
    $FileDirRoot is /var/www/files
    This may also be specified as the optional last parameter to the
    MailToUrl command line.

.Quota
    The maximum size that the archive may be, units Kilobytes, default 2048
    (IE 2 Megabytes). If this is 0 no quota limit is enforced.

.KeepDays
    The default number of days for which a file will be kept.
    Default: 31

.WarnPurge
    A file owner can be warned several days in advance that a file will be
    removed. Set this to the number of days, 0 switches this off.
    The owner will only be sent one warning, the way that this is done
    assumes that the purge run is done once a day.
    Default: 0.

.SayPurged
    If set a file owner will be told if a file of theirs has been purged.
    If present the value should be numeric. A value of '0' will mean that the
    owner will not be told, any other value results in email being sent to
    the file owner.
    Default: 0.

.MailFrom
    The envelope from address for mail sent to the attachment sending user.
    This should not be the mailbox, if it were there is a risk of a mail loop
    where machines bounce mail between them.
    Default: $mailbox-bounce@$domain

.MailReplyTo
    The Reply-To: address in the header. The default is the receiving
    mailbox, so that when (not 'if') a user 'replies' to email it comes back
    to MailToUrl for processing.

.LogFile
    Where a record of transactions and actions is kept.
    Default: /var/log/$ProgramName, ie /var/log/MailToUrl

.WorkDir
    A temporary directory where files are created.
    Default: /var/tmp

It might help if you know that the URL generated is:

    $protocol://$domain/$ProtoRoot$mailbox/$AttachmentName

The $mailbox is the local part of the email address.

MailToUrl Command Line
**********************

Help:
    MailToUrl --help

Remove expired attachments from Web/Ftp page:
    MailToUrl --purge mailbox [File Directory]

Upload new attachments, mail on stdin:
    MailToUrl mailbox sender subject_command [File Directory]

The email Subject Line
**********************

This may be used by the sender to specify options or give simple commands:

on uploading an attachment to be saved as a file, how long before it is
removed, eg:
    keep 2 days
    keep 2 weeks
    keep 1 months
    keep 1 year
  The original sender of a file can send a replacement file, or perhaps the
  same file with a different keep date. Only the original sender can update
  the file.

to see what files are present:
    list

to remove a file called FileName:
    delete FileName
  Only the original sender of the file can delete the file.

Purging
*******

The easiest way to purge expired files is to run MailToUrl from cron:

    # Purge expired files from the mail file store:
    30 4 * * * /usr/local/bin/MailToUrl --purge film_club
    32 4 * * * /usr/local/bin/MailToUrl --purge chess_club /var/ftp/files/chess_club

The second one specifies the archive directory since it is not in the default
location (under /var/www).
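
To show how a keep period given in the subject line relates to the expiry
time that a purge run checks, here is a small Python sketch. It is an
illustration under stated assumptions, not MailToUrl's Perl implementation:
the function names are hypothetical, the unit lengths (a month as 31 days, a
year as 365 days) and the case insensitive matching are guesses, and the
expiry test simply compares the integer expiry time with the current time.

```python
import re

# Assumed unit lengths in seconds; MailToUrl's own values may differ.
UNIT_SECONDS = {
    "day": 86400,
    "week": 7 * 86400,
    "month": 31 * 86400,
    "year": 365 * 86400,
}


def keep_seconds(subject):
    """Return the keep period in seconds for 'keep N units', else None."""
    match = re.fullmatch(r"\s*keep\s+(\d+)\s+(day|week|month|year)s?\s*",
                         subject, re.IGNORECASE)
    if match is None:
        return None  # not a keep command, eg 'list' or 'delete FileName'
    return int(match.group(1)) * UNIT_SECONDS[match.group(2).lower()]


def expired(entries, now):
    """Names of (name, expiry time) pairs whose expiry is at or before now."""
    return [name for name, expires in entries if expires <= now]


# An upload received at time 'received' with subject 'keep 2 weeks'
received = 1147019172
expires = received + keep_seconds("keep 2 weeks")
```

A daily purge run would then remove every file for which the stored expiry
time has passed, which is why running it once a day from cron is enough.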

File system location and permissions
************************************

The directory containing files that list users of the service is given as
/var/spool/exim/file_domains/$local_part, another good choice might be
/usr/exim/file_domains/$local_part. The files in this directory must be
readable by the MTA (eg exim). In this case the permissions should be for the
user 'exim', check also the user in 'mail_to_url_transport'.

The FTP area should be readable by the anonymous FTP user, a good place to
put these would be /var/ftp/files/$local_part/ (ie a separate directory for
each address). The directories should be writable by the MTA (or whatever
user is set up in the exim transports section). You probably don't want these
files and directories to be searchable by the anonymous ftp user, you achieve
this by making the directories executable but not readable to that user, eg
mode 751. Eg:

    # ls -la /var/ftp/files
    total 22
    drwxr-x--x 11 mail root 4096 Jun 29  2007 .
    drwxr-xr-x  4 root root 4096 Nov 17 04:43 ..
    drwxr-x--x  2 mail root 4096 Feb 17 04:30 files
    drwxr-x--x  2 mail root 4096 Jun 29  2007 friends

This means that someone has to be told the URL of a file to be able to
download it. By no means perfect security, but better than nothing.
You might decide that a directory is generally readable.

The default log file is:
    /var/log/MailToUrl
this needs to be writable by the user that the script runs as.

Portability Notes
*****************

MailToUrl is written in Perl, however the 'du' command is used to determine
disk usage.

The Perl modules listed below are used, you should already have them
installed, except possibly MIME::Parser.

    MIME::Parser;
    Data::Dumper;
    File::Copy;
    POSIX qw(strftime);
    Fcntl ':flock';
    Sys::Hostname;