Motivation
Last week I was writing a Python script to make an automatic backup, and I decided to send myself an email
if scp failed. I wanted to send the email from Python, possibly via Gmail, and I found this interesting
blog post: Sending emails via Gmail with Python. I like Python, it’s a good programming language, but my
heart (as a developer!) beats for the Objective Caml programming language.
So I decided to port the script presented in that post to OCaml. The result is this sendmail.ml.
Compiling the script
To compile the script you need four software components:
- the Objective Caml environment. You can download it from the INRIA site;
- Findlib, to make compiling very simple;
- Ocamlnet: here is the home page of the project;
- the OCaml binding to the SSL library.
You can of course compile all of this yourself, but every decent Linux distribution has it all packaged. In
Debian you have to run the following command:
# aptitude install ocaml libocamlnet-ocaml-dev \
libssl-ocaml-dev ocaml-findlib
Now, to compile the script, issue the command:
$ ocamlfind ocamlopt -linkpkg -package \
netstring,smtp,ssl,str sendmail.ml -o sendmail
Before using it, remember to customize your name, email address, Gmail user and password.
Code comparison
The first difference that jumps out when comparing the two scripts is the number of lines: 41 lines for
Python against 163 for my OCaml version. The difference is explained by the fact that the Python standard
library comes with an almost full-featured SMTP client, with ESMTP and TLS support. Objective Caml, on the
other hand, has a very concise standard library, which includes essential modules and data structures, but
no “batteries” are provided out of the box. This is a deliberate design decision by INRIA and, in some ways,
I agree with them. Luckily the OCaml community is a source of excellent libraries and bindings, like
Ocamlnet by Gerd Stolpmann and the SSL library binding, written by Samuel Mimram. The first one in
particular is the Swiss Army knife for network-oriented battles.
Since the SMTP client provided by Ocamlnet doesn’t include TLS support, I decided to steal its source code
and adapt it to my needs, to get a more comfortable, higher-level interface resembling the one offered by
the Python standard library.
So the difference in length is easily explained: 109 lines of code are devoted to the smtp_client class,
and the actual script is 54 lines long.
The forward pipe operator
All Turing-complete programming languages are equivalent, but everyone knows this is only the theory, and
everyone has a programming language of choice. Here are two examples of what you can do in OCaml.
The first is the pipe operator:
let (|>) x f = f x
Here we define an infix operator (very common in FP) which simply inverts the order of its operands. What
the frack is this? Very simple: we use it to swap a function and its last argument, so if we want to
compute the 3rd Fibonacci number we can write:
let fib3 = fibonacci 3
but also:
let fib3 = 3 |> fibonacci
This is not just a style issue: we have defined a simple infix operator that feeds a value to a function,
so we can of course chain several functions together, like in a shell script with the Unix pipe operator,
transforming an ugly and hard-to-read call:
let result = func1 (func2 (func3 x))
into:
let result = x |> func3 |> func2 |> func1
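To try this out, here is a small self-contained sketch. The fibonacci function and the sample list are only
illustrative assumptions (they are not taken from sendmail.ml), and the operator is not redefined here
because OCaml's standard library has provided |> out of the box since version 4.01.

```ocaml
(* Illustrative Fibonacci, assuming the convention fib 0 = 0 and fib 1 = 1. *)
let rec fibonacci n =
  if n < 2 then n else fibonacci (n - 1) + fibonacci (n - 2)

(* The two spellings from the text compute the same value. *)
let fib3_direct = fibonacci 3
let fib3_piped = 3 |> fibonacci

(* Nested calls read inside-out... *)
let nested =
  List.fold_left ( + ) 0 (List.map fibonacci (List.sort compare [3; 1; 2]))

(* ...while the piped version reads left to right, like a shell pipeline. *)
let piped =
  [3; 1; 2]
  |> List.sort compare
  |> List.map fibonacci
  |> List.fold_left ( + ) 0
```

Both nested and piped sort the list, map each element to its Fibonacci number and sum the results; only the
reading order changes.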
In the sendmail.ml script, line 127, we read:
email_string |>
  Str.global_replace new_line_regexp "\r\n" |>
  Str.split crlf_regexp |>
  List.iter (fun s ->
    self#output_string (if String.length s > 0 && s.[0] = '.' then
                          ("." ^ s ^ "\r\n")
                        else s^"\r\n"));
Here we take the string containing the email, replace all newlines with the sequence “\r\n”, split the
string into lines and finally send each line to the SMTP server, taking care to escape each line starting
with a period by prepending another period. In 6 lines of code.
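The same pipeline can be sketched as a standalone function that is easy to play with in the toplevel. This
is not the code from sendmail.ml: it is a simplified variant that uses only the standard library
(String.split_on_char instead of Str) and returns the lines as a list instead of writing them to a socket.
The names strip_cr, dot_stuff and smtp_lines are made up for this example.

```ocaml
(* Drop a trailing '\r', so that "\n" and "\r\n" line endings are treated alike. *)
let strip_cr s =
  let n = String.length s in
  if n > 0 && s.[n - 1] = '\r' then String.sub s 0 (n - 1) else s

(* SMTP escaping: a line starting with a period gets one more period in front. *)
let dot_stuff s =
  if String.length s > 0 && s.[0] = '.' then "." ^ s else s

(* Turn a message body into the list of CRLF-terminated lines to send. *)
let smtp_lines body =
  body
  |> String.split_on_char '\n'
  |> List.map strip_cr
  |> List.map dot_stuff
  |> List.map (fun s -> s ^ "\r\n")

(* Sample body mixing both line-ending styles, with one line starting with '.'. *)
let example = smtp_lines "hi\n.hidden\r\nbye"
```

Returning a list keeps the sketch free of side effects; in the real script each line goes straight to the
server through self#output_string.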
Algebraic data types
Algebraic data types are a very interesting aspect of functional programming. We can easily wrap two
heterogeneous data types into a single one with a couple of lines of code:
type socket =
  | Unix_socket of Unix.file_descr
  | SSL_socket of Ssl.socket
The smtp_client class contains a reference to the connection handle used for communicating with the
server, which is either a plain file descriptor or an SSL socket, depending on the state of the
communication. I do not want to create a virtual class or an interface plus two implementing classes, as I
would have to do in horrible languages like Java, spending half an hour deciding which methods to put in
the public interface, and so on; after all, it’s only a file descriptor!
Now I have a new type which is a disjoint union of the two original types and I can write code like this
(line 54):
let input = match channel with
  | Unix_socket s -> Unix.read s
  | SSL_socket s -> Ssl.read s in
Here we say: if channel is actually a Unix file descriptor, let’s define a new function “input” as the
standard “read” function from the Unix module; otherwise, if channel is an SSL socket, let’s define “input”
as the Ssl.read function, which works only on encrypted sockets. From now on I’ll use input instead of
either of the two original functions.
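If you want to experiment with the idea without installing the Unix and Ssl libraries, here is a toy
sketch. Everything in it is an assumption made for illustration: both variants wrap a Buffer.t instead of
a real socket, the constructor names are invented, and fake_encrypt merely upper-cases the data as a
stand-in for encryption.

```ocaml
(* Toy counterpart of the socket type: two variants, each carrying its own handle. *)
type socket =
  | Plain_socket of Buffer.t
  | Secure_socket of Buffer.t

(* Pretend "encryption": upper-case the data. Purely illustrative. *)
let fake_encrypt = String.uppercase_ascii

(* Choose the writing function once, by matching on the variant. *)
let output_for channel : string -> unit =
  match channel with
  | Plain_socket buf -> Buffer.add_string buf
  | Secure_socket buf -> fun s -> Buffer.add_string buf (fake_encrypt s)

let plain = Buffer.create 16
let secure = Buffer.create 16

let () =
  (* The calling code is identical for both kinds of channel. *)
  let output = output_for (Plain_socket plain) in
  output "ehlo ";
  output "example";
  let output = output_for (Secure_socket secure) in
  output "ehlo ";
  output "example"
```

As with “input” above, the match is done once; after that the caller uses the selected function and no
longer cares which variant is behind it.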
OK, it’s time to stop the waffle. Enjoy the script if you need it: it’s completely free, as in free beer,
free speech and even free sex! :-)